Machine Translation: Another Step Forward

4 December 2017

Machine translation (MT) is nothing new: the first live demonstration took place back in the 1950s. But as Google Translate and similar products prove, MT output has become usable and accessible to anyone with a reliable internet connection. In the translation industry, the technology (usually enterprise-level MT engines) is used extensively for certain types of content – usually less creative texts – to relieve pressure on costs and deadlines. The accessibility of the internet, along with content-driven marketing, has led to a huge increase in content publication, which, in conjunction with an increasingly globalised economy, means companies need ever greater volumes of translation. At the same time, pressure on budgets has made MT with post-editing a real option for many companies: in the right circumstances, machine output revised by human post-editors can reach human quality at lower rates than human translation alone. Now the release of the latest generation – neural machine translation – is making MT one of the translation industry’s hottest buzzwords.

How did MT start?

The first MT engines were rules-based (called ‘RBMT’). They looked at languages’ individual words and grammar rules – what does each part of a text represent and what is the equivalent in the foreign language? RBMT replaced each word or grammar feature in language A with the equivalent word or grammar feature in language B. This doesn’t quite work though. Anyone who knows more than one language will tell you that finding a simple word-for-word translation isn’t always easy. Imagine you want to translate the word “runway” into French, Spanish and Dutch. It sounds simple: piste for French, pista for Spanish. For Dutch? Not simple. Is the plane taking off or landing? If it’s taking off, say startbaan; when it lands, it’s a landingsbaan. This is just one example for one language; every language has its own eccentricities that make MT difficult.
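To make the runway problem concrete, here is a minimal sketch of word-for-word substitution in Python. The tiny lexicon and the function name are invented for illustration and are not taken from any real RBMT engine; the point is only that a one-to-one dictionary has to commit to a single Dutch word.

```python
# Hypothetical toy lexicon: each English word maps to exactly one target word.
LEXICON = {
    "fr": {"the": "la", "runway": "piste"},
    "es": {"the": "la", "runway": "pista"},
    # Dutch forces a choice: this entry only covers the take-off sense.
    "nl": {"the": "de", "runway": "startbaan"},
}


def translate_word_for_word(sentence, target):
    """Replace each source word with its single dictionary equivalent.

    Words missing from the lexicon are passed through unchanged.
    """
    table = LEXICON[target]
    words = sentence.lower().split()
    return " ".join(table.get(word, word) for word in words)


if __name__ == "__main__":
    print(translate_word_for_word("the runway", "fr"))
    print(translate_word_for_word("the runway", "es"))
    # The same Dutch word comes out whether the plane takes off or lands,
    # so the landing sentence gets the wrong term.
    print(translate_word_for_word("the plane lands on the runway", "nl"))
```

Whichever Dutch entry the dictionary holds, one of the two situations is mistranslated, which is why a real rules-based engine needs extra context rules on top of its word lists.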

What came next?

After RBMT, statistical machine translation (SMT) came along and looked at slightly larger units: more than words, but less than sentences. It broke text into short phrases and ran them against a massive database of existing translations to find the most likely translation. Sound better? Well, yes. But still not amazing. Translators don’t translate parts of sentences and glue them together. SMT offers better results than RBMT, but mistakes still abound and it only works for certain content. The output is better for general documents (correspondence, reviews, etc.), but output for more creative content like marketing is most often unusable.
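The "most likely translation" idea can be sketched in a few lines of Python. Everything here is made up for illustration: the phrase table, its counts and the function names are assumptions, and a real SMT engine also uses a language model and reordering, which this sketch leaves out.

```python
# Hypothetical phrase table: how often each source phrase was translated
# a given way in an invented database of existing translations.
PHRASE_COUNTS = {
    "thank you": {"merci": 90, "je vous remercie": 10},
    "very much": {"beaucoup": 80, "très": 20},
}


def most_likely(phrase):
    """Return the most frequent translation of a phrase and its relative frequency."""
    counts = PHRASE_COUNTS[phrase]
    total = sum(counts.values())
    best = max(counts, key=counts.get)
    return best, counts[best] / total


def translate_phrases(phrases):
    """Translate each phrase independently and glue the results together."""
    return " ".join(most_likely(phrase)[0] for phrase in phrases)


if __name__ == "__main__":
    print(most_likely("thank you"))
    print(translate_phrases(["thank you", "very much"]))
```

Because each phrase is chosen on its own and then joined, nothing checks that the pieces fit together, which is exactly the "glue" weakness described above.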

And now…?

The current hot topic in the translation technology world is neural machine translation (NMT), which looks at even larger units of language – entire sentences. And it doesn’t just look at sentences; it learns from them. Instead of MT developers deciding which linguistic features an engine should focus on during training, NMT studies a database of existing translations (like SMT) and learns and decides for itself. It looks at words and how important they are and how they relate to each other within the larger context (and all translators will tell you how important context is), then applies these findings to new texts for translation.
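One common way neural systems weigh "how important words are" is an attention-style score. The article does not say which mechanism any particular engine uses, so treat the following Python as a loose illustration only: the word vectors are hand-picked rather than learned, and all names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention_weights(query, keys):
    """Score each context word against the query with a dot product,
    then normalise the scores into weights."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)


# Hand-picked toy vectors; a real engine would learn these from data.
CONTEXT_WORDS = ["the", "plane", "lands"]
CONTEXT_VECTORS = [[0.1, 0.0], [0.9, 0.4], [0.2, 1.0]]
RUNWAY_QUERY = [0.5, 1.0]  # stands in for the word being translated

if __name__ == "__main__":
    weights = attention_weights(RUNWAY_QUERY, CONTEXT_VECTORS)
    for word, weight in zip(CONTEXT_WORDS, weights):
        print(f"{word}: {weight:.2f}")
```

With these invented numbers, "lands" receives the largest weight, mirroring how context could steer the choice between startbaan and landingsbaan.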

So, what’s after NMT?

After SMT dominated chat amongst MT geeks around the water cooler for the last decade, NMT has no doubt been the translation industry’s 2017 buzzword. However, MT developers are already talking about what’s next: Deep NMT. By combining whole new layers of coding with client-approved glossaries and other resources, Deep NMT is set to produce even better translations. Creating a Deep NMT engine takes more work than ‘shallow’ NMT, but the rewards are expected to be much greater.

With all these developments, is SMT obsolete now?

The development of NMT and Deep NMT is exciting, but it’s not the solution to all MT problems. NMT can produce some great translations and is more fluent than SMT. It’s better for more creative texts – say, for marketing – and if the engine has been built using existing databases of very specific content – say, for user manuals or other technical documents – it can do much better than SMT. But if you’re looking to translate general content, then SMT, at least for now, is outperforming NMT.

Well, with all these advances, can we ditch human translators now?

The short answer is no. NMT and Deep NMT are off to very promising starts, with NMT output more fluent than SMT translations, but we are still a long way from ushering in a Post-Human Translator era: some MT output remains difficult, even impossible, to understand and still contains numerous mistakes. The human translator’s role remains vital. MT output is good and getting better by the year, but it must still be post-edited by a human translator, somebody with knowledge of the source and target languages and, of course, the subject area.

In the end, MT is not about eliminating the human translator but helping them. It’s another arrow in the translator’s quiver to increase productivity, providing ready-to-edit translations to deliver quality at an ever-increasing pace.

If you would like more information on the possibilities offered by Machine Translation, please get in touch and one of our Account Managers will be ready to help. Email us at info@thetranslationpeople.com or give us a call on 0161 850 0060.