Why Should We Know Whether We Can Trust MT?

Why is quality important? Since we introduced a technology that can estimate the reliability and quality of machine translation, many people have asked me why it is important to know the quality of machine translations. I’d like to compare this to something more concrete: a car and its parts. Imagine that you can buy car parts or components made either by human professionals or by automatic machines.

About Estimating the Quality of Machine Translations

It is usually very difficult to estimate or evaluate the quality of automatic translations. The main reason is that the user typically translates a piece of text either from or into a language that they don’t understand well enough. (If they did understand it, they wouldn’t need to translate it.) Thus they cannot really know whether the translation contains errors.

The traditional way to assess machine translation quality has been to compare the machine translation with a translation made by a professional translator. This has been done mainly for research and development purposes by the developers of machine translation technology. However, this method is not useful in real-life situations, because usually no professional translation is available for the material that needs to be machine translated.
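To make the idea of comparing against a professional translation concrete, here is a minimal sketch in Python. It scores a machine translation by simple word overlap with a reference translation. The function name `unigram_f1` and the overlap measure are our own illustration, not the method any particular developer uses; real evaluation metrics are considerably more refined.

```python
from collections import Counter


def unigram_f1(machine: str, reference: str) -> float:
    """Word-overlap F1 score between a machine translation and a reference.

    Illustrative only: words are split on whitespace and lowercased,
    and word order is ignored.
    """
    mt_words = Counter(machine.lower().split())
    ref_words = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((mt_words & ref_words).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(mt_words.values())
    recall = overlap / sum(ref_words.values())
    return 2 * precision * recall / (precision + recall)
```

Note that the function cannot be called without a `reference` argument, which is exactly the limitation described above: in everyday use there is no reference translation to pass in.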

There is definitely a demand for technology that automatically estimates the quality of machine translations. The general idea is a system that automatically tells the user whether a machine translation is good, bad, or something in between. This kind of technology would benefit us in several ways:

  1. With reliable machine translation quality estimates, automatic translations can be used in more demanding situations. Currently machine translation is often used for translating information where translation errors would cause minimal or no harm. This has seriously limited the use of automatic translation.
  2. Bad translations can be filtered out of a post-editing process. Post-editing becomes more efficient because the post-editor does not need to spend time on useless translations.
  3. Deciding whether the machine translation is good enough for publishing will become easier.
  4. The system can be used to select the best among several machine translations. This will obviously improve the overall perceived machine translation quality as well as avoid embarrassing mistakes.
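
Points 2 and 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` type, the `quality` score between 0 and 1, and the `min_quality` threshold are all hypothetical, and the sketch does not show how a quality estimate is actually produced.

```python
from typing import Iterable, NamedTuple, Optional


class Candidate(NamedTuple):
    engine: str     # hypothetical name of the translation engine
    text: str       # the machine translation itself
    quality: float  # hypothetical quality estimate, 0.0 (bad) to 1.0 (good)


def select_best(candidates: Iterable[Candidate],
                min_quality: float = 0.5) -> Optional[Candidate]:
    """Drop candidates below the threshold, then return the best one left.

    Returns None when every candidate is too poor to be worth post-editing.
    """
    usable = [c for c in candidates if c.quality >= min_quality]
    if not usable:
        return None
    return max(usable, key=lambda c: c.quality)
```

A `None` result corresponds to the case where it is cheaper to translate from scratch than to post-edit any of the machine translations.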

Lately we at Multilizer have taken significant steps in estimating machine translation quality automatically. Quality estimation will also be one of the subjects at a scientific workshop on machine translation in Montreal at the beginning of June.


In Machine Translation Post-editing, Quality Can Be Traded for Productivity

In association with Kites, Finnish machine translation experts and enthusiasts have formed a Special Interest Group for machine translation in Finland. In one meeting we discussed the hot topic of post-editing machine translations. Post-editing means that a professional translator checks and edits automatic translations made by a machine. The interesting topic and the excellent presentations (one by Jukka Outinen of Lionbridge and another by freelance translator Tommi Nieminen) sparked a lively discussion.

One of the ideas highlighted during the meeting was that in traditional translation by a professional translator, it does not make much sense to lower one’s quality requirements, because doing so does not improve productivity. A professional translator cannot choose to “write bad translations”. With professional translators, the style and fluency of the text come together with the translation. However, the situation is different when post-editing machine translations.

In post-editing machine translations, it can make sense to lower quality requirements because that does improve productivity. When the task is to post-edit an automatic translation, the translator can choose not to correct those parts of the machine translation that are correct but written in clumsy language. Thus the translator saves some time at the expense of quality. In post-editing, therefore, quality can indeed be traded for productivity.

This naturally changes the translation market. Clients can now choose between lower and higher quality, depending on their requirements and budget. Affordable, quick, and good-enough translations made by a machine and a human together complete the range of available translation services. An increase in productivity will enlarge the entire translation industry.


When Machine Translation Usefulness Is Higher Than Quality

The usefulness of machine translation is often regarded as the same thing as its quality. However, machine translation can be used in many ways, and different things matter in different use cases. Here we present one case where a translation’s usefulness depends only partially on the machine translation quality.

The example is about a newsletter that was machine translated into different languages. The newsletter recipients were asked to evaluate both the quality of the translation and how useful they found it. (The complete case study is linked as the source below.) The graph below shows the results.

Source: http://www.roi-learning.com/dvm/pubs/articles/tatc-24/

These results are quite interesting. The graph shows that although the machine translation quality was rated as far from perfect, the translations’ usefulness was rated higher than their quality. However, this applies only when translation quality is above a certain threshold. Poor-quality machine translations are naturally deemed useless.

Although this case study is rather old, it seems that the respondents were rather tolerant of translation errors. The quality of machine translation today is quite different from what it was at the beginning of the 21st century. It’s hard to tell what the respondents’ general conceptions of the overall quality of machine translation were. It is probable that today the use of free machine translation services has taught people to expect even less from machine translation.

The most important thing here is to realize that the quality (or the lack of it) doesn’t directly determine the usefulness of the machine translation. The quality of machine translation is more than just correct grammar.


Language Barriers Visualized [graph]

Globalization seems to be a phenomenon which is here to stay. For some people globalization is a possibility and for others it is a threat. The latter group says that globalization is going to kill local cultures, habits and languages. These people can now be less concerned about the issue, because language barriers still exist.

In the graph below you can see the language barriers in action. It shows how strongly or weakly each language is linked to the others. The graph tells, for example, how large a share of the sites in German link to sites in French (0.01). We can draw interesting conclusions from the graph.

Source: http://googleresearch.blogspot.com/2011/07/languages-of-world-wide-web.html

We can assume that when a person visits an interesting site, he or she shares the link with others on social media, in a blog post, or elsewhere. This means that if, for example, many French people often visit German sites, there should be many links from French sites to German sites and vice versa. Likewise, if French people don’t visit German sites often, there will be only a small number of links.
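
The reasoning above can be illustrated with a small Python sketch that turns a list of links into per-language shares. This is our own simplified illustration, assuming each link is given as a hypothetical (source language, target language) pair; it is not a description of how the figures in the graph were actually computed.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def link_fractions(
    links: Iterable[Tuple[str, str]],
) -> Dict[Tuple[str, str], float]:
    """Share of each source language's outgoing links per target language.

    `links` is an iterable of (source_language, target_language) pairs,
    one pair per link.
    """
    totals: Counter = Counter()       # outgoing links per source language
    pair_counts: Counter = Counter()  # links per (source, target) pair
    for src, dst in links:
        totals[src] += 1
        pair_counts[(src, dst)] += 1
    return {
        (src, dst): count / totals[src]
        for (src, dst), count in pair_counts.items()
    }
```

For example, if three of four links from German sites point to other German sites and one points to a French site, `link_fractions` gives 0.75 for German to German and 0.25 for German to French. A small share for a language pair is what we read as a language barrier.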

As the graph illustrates, there are surprisingly few connections between languages. Even when people do understand other languages, they usually visit sites in their own language. For example, French people commonly study Spanish at school, yet they don’t commonly visit Spanish web sites. And although Germans often study French, they don’t usually visit French sites. The same applies to almost any other language.

There are a few possible reasons for the lack of links between different languages. First, the content might already be translated into many languages, so people can choose the language in which they read it, and they tend to choose the language they understand best. Second, people might find foreign-language content uninteresting. This scenario is only theoretical, because we cannot assume that the attractiveness of content depends on the language used. Third, people might not fully understand content written in a language other than their native one. If they don’t understand it, they may not want to link to it.

All this confirms the common opinion that you have to communicate in people’s own language to reach them. In the future, machine translation quality will surely improve and lower the language barrier.


