This article was originally published at translation-blog.multilizer.com.
Machine translation seems quite simple: you enter a text segment and the machine gives you a translation. From the user’s point of view that really is enough, because for most of us the technology behind the tool is irrelevant as long as it translates well enough. However, the “secrets” behind machine translators are indeed interesting. Have you ever wondered how they work?
Basically, a machine translator follows one of the two most common methods. First, a machine translator can apply language-related rules, such as grammar and conjugation, with the help of a separate dictionary. These rules and dictionaries are built into the translator by humans. The machine translator then looks up words in the dictionary and applies the rules to all the text that it needs to translate.
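To make the rule-based idea concrete, here is a minimal sketch in Python. Everything in it is a made-up illustration rather than how any real product works: the tiny English-to-Spanish `DICTIONARY`, the part-of-speech tags, and the single word-order rule are all assumptions chosen for the example.

```python
# Toy rule-based translator, English -> Spanish (illustrative only).
# The dictionary entries and the one grammar rule are hand-written
# assumptions, standing in for the rules that humans build into a real system.

DICTIONARY = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "is": ("es", "VERB"),
    "new": ("nuevo", "ADJ"),
}


def translate_rule_based(sentence):
    # Step 1: look up every word in the dictionary.
    # Unknown words are passed through unchanged and tagged "UNK".
    tagged = []
    for word in sentence.lower().split():
        target, tag = DICTIONARY.get(word, (word, "UNK"))
        tagged.append((target, tag))

    # Step 2: apply a grammar rule. In Spanish the adjective usually
    # follows the noun, so swap every adjective + noun pair.
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1

    return " ".join(word for word, _ in tagged)


print(translate_rule_based("the red car is new"))  # el coche rojo es nuevo
```

Even this toy shows why the approach is laborious: every word and every rule has to be entered by hand, and a word missing from the dictionary is simply left untranslated.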
Second, a machine translator can use a database as the source of all the information it needs for translating. This database may contain a huge amount of text or other data. The machine translator then uses this information to determine which words have similar meanings in different languages, which expressions are used most commonly, which word variations are possible, and so on. Typically, these kinds of machine translators calculate probabilities to find the best translation.
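The probability idea can be sketched in a few lines of Python as well. This is a simplified illustration under stated assumptions: `ALIGNED_PAIRS` is an invented list standing in for the database, and it assumes the source and target words have already been matched up, which real systems have to work out from large amounts of text.

```python
from collections import Counter, defaultdict

# Hypothetical "database": word pairs observed in already-translated text.
# Each entry means the English word was seen translated as the Spanish word.
ALIGNED_PAIRS = [
    ("bank", "banco"), ("bank", "banco"), ("bank", "banco"), ("bank", "orilla"),
    ("house", "casa"), ("house", "casa"), ("house", "hogar"),
]


def build_table(pairs):
    # Count how often each target word was seen for each source word.
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1

    # Turn the counts into probabilities: P(target | source).
    table = {}
    for source, target_counts in counts.items():
        total = sum(target_counts.values())
        table[source] = {t: c / total for t, c in target_counts.items()}
    return table


def best_translation(word, table):
    # Pick the most probable translation, or pass the word through
    # with probability 0.0 if the database has never seen it.
    candidates = table.get(word)
    if not candidates:
        return word, 0.0
    target = max(candidates, key=candidates.get)
    return target, candidates[target]


TABLE = build_table(ALIGNED_PAIRS)
print(best_translation("bank", TABLE))  # ('banco', 0.75)
```

Note that the translator knows nothing beyond its examples: if most of the pairs in the data were wrong, the most probable answer would be wrong too.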
In previous posts we have discussed the quality of machine translation, and the process behind the machine translator can have a huge effect on that quality. On the one hand, a translator that uses a database can learn from it, especially if more data is collected automatically all the time. Although a bigger database usually improves the quality, the quality of the database itself is extremely important. It doesn’t matter how huge the database is if it contains only bad examples. Likewise, if the database is not developed and updated on a regular basis, the translation quality won’t improve.
On the other hand, the quality is easier to control if one teaches all the rules to the machine translator. However, this method is quite laborious and requires accuracy: incorrect rules will lead to incorrect translations. Equally, if the groundwork is done properly, the machine translator produces good quality. Either way, these rules should also be reviewed and updated to maintain the quality.
In conclusion, both of these machine translation types have their pros and cons. Which one do you think is better?