Translation technology | 04.12.2017

Machine Translation: Another Step Forward

Machine translation (MT) is nothing new: the first live demonstration took place back in the 1950s. But as Google Translate and similar products prove, MT output has become usable and accessible for anyone with a reliable internet connection. In the translation industry, the technology (usually enterprise-level MT engines) is used extensively for certain types of content, usually less creative texts, to ease pressure on costs and deadlines.

The accessibility of the internet, along with content-driven marketing, has led to a huge increase in content publication. Combined with an increasingly globalised economy, this means companies need ever greater volumes of translation. At the same time, pressure on budgets has made MT with post-editing a real option for many companies: in the right circumstances, machine output revised by human post-editors can deliver human quality at lower rates than human translation alone. Now the release of the latest generation, neural machine translation, is making MT one of the translation industry’s hottest buzzwords.

How did MT start?

The first MT engines were rules-based (called ‘RBMT’). They looked at languages’ individual words and grammar rules – what does each part of a text represent and what is the equivalent in the foreign language? RBMT replaced each word or grammar feature in language A with the equivalent word or grammar feature in language B. This doesn’t quite work though. Anyone who knows more than one language will tell you that finding a simple word-for-word translation isn’t always easy. Imagine you want to translate the word “runway” into French, Spanish and Dutch. It sounds simple: piste for French, pista for Spanish. For Dutch? Not simple. Is the plane taking off or landing? If it’s taking off, say startbaan; when it lands, it’s a landingsbaan. This is just one example for one language; every language has its own eccentricities that make MT difficult.
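To make the runway problem concrete, here is a minimal sketch of word-for-word lookup. It is not how any real RBMT engine is built: the lexicon, the language codes and the function names are all invented for illustration, and a real engine would also apply grammar rules.

```python
# Hypothetical toy lexicon (an assumption for illustration only):
# each (source word, target language) pair maps to its candidate translations.
LEXICON = {
    ("runway", "fr"): ["piste"],
    ("runway", "es"): ["pista"],
    # Dutch needs context: is the plane taking off or landing?
    ("runway", "nl"): ["startbaan", "landingsbaan"],
}


def translate_word(word, target_lang):
    """Return every candidate translation; unknown words are passed through."""
    return LEXICON.get((word.lower(), target_lang), [word])


def is_ambiguous(word, target_lang):
    """True when a simple one-to-one replacement is not possible."""
    return len(translate_word(word, target_lang)) > 1


for lang in ("fr", "es", "nl"):
    print(lang, translate_word("runway", lang), "ambiguous:", is_ambiguous("runway", lang))
```

French and Spanish come back with a single candidate, while Dutch returns two. A purely word-based rule has no way of choosing between them without knowing more about the sentence.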

What came next?

After RBMT, statistical machine translation (SMT) came along and looked at slightly larger units: more than words, but less than sentences. It takes short phrases and runs them against a massive database of existing translations to find the most likely translation. Sound better? Well, yes. But still not amazing. Translators don’t translate parts of sentences and glue them together. SMT offers better results than RBMT, but mistakes still abound and it only works for certain content. The output is better for general documents (correspondence, reviews, etc.), but output for more creative content like marketing is most often unusable.
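The phrase-lookup idea can be sketched roughly as follows. This is a heavily simplified illustration, not a real SMT system: the phrase table and its probabilities are made up, the function name is hypothetical, and the greedy longest-match strategy is a stand-in for the far more sophisticated decoding that production engines use.

```python
# Hypothetical phrase table (an assumption for illustration only):
# each source phrase maps to candidate translations with invented probabilities.
PHRASE_TABLE = {
    ("thank", "you"): {"merci": 0.9, "je vous remercie": 0.1},
    ("very", "much"): {"beaucoup": 0.8, "très": 0.2},
    ("thank",): {"remercier": 1.0},
}


def translate(sentence, table=PHRASE_TABLE, max_len=3):
    """Greedily match the longest known phrase and keep its most likely translation."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest phrase first, then progressively shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = tuple(words[i:i + n])
            if phrase in table:
                candidates = table[phrase]
                output.append(max(candidates, key=candidates.get))
                i += n
                break
        else:
            # No phrase matched: pass the word through untranslated.
            output.append(words[i])
            i += 1
    return " ".join(output)


print(translate("Thank you very much"))  # merci beaucoup
print(translate("thank him"))            # remercier him
```

The second call shows the "glue" problem: each fragment is translated on its own, so the result is stitched together from pieces that do not form a proper sentence.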

And now…?

The current hot topic in the translation technology world is neural machine translation (NMT), which looks at even larger units of language: entire sentences. It doesn’t just look at sentences; it learns from them. Instead of MT developers deciding which linguistic features an MT engine should focus on during training, NMT studies a database of existing translations (as SMT does) and learns and decides for itself. It looks at words and how important they are, how they relate to each other within the larger context (and all translators will tell you how important context is) and applies these findings to new texts for translation.

So, what’s after NMT?

After SMT dominated chat amongst MT geeks around the water cooler for the last decade, NMT has no doubt been the translation industry’s 2017 buzzword. However, MT developers are already talking about what’s next: Deep NMT. By adding whole new layers of coding and combining them with client-approved glossaries and other resources, Deep NMT is set to produce even better translations. Creating a Deep NMT engine takes more work than ‘shallow’ NMT, but the rewards are expected to be much greater.

With all these developments, is SMT obsolete now?

The development of NMT and Deep NMT is exciting, but it’s not the solution to all MT problems. NMT can produce some great translations and is more fluent than SMT. It’s better for more creative texts (say, for marketing), and if the engine has been built using existing databases of very specific content (say, for user manuals or other technical documents), it can do much better than SMT. But if you’re looking to translate general content then SMT, at least for now, is out-performing NMT.

Well, with all these advances, can we ditch human translators now?

The short answer is no. NMT and Deep NMT are off to very promising starts, with NMT output more fluent than SMT translations. But we are still a long way from ushering in a Post-Human Translator era: some MT output remains difficult, even impossible, to understand and still contains numerous mistakes. The human translator’s role remains vital. MT output is good and getting better by the year, but it must still be post-edited by a human translator, somebody with knowledge of the source and target languages and, of course, the subject area.

In the end, MT is not about eliminating the human translator but helping them. It’s another arrow in the translator’s quiver to increase productivity, providing ready-to-edit translations to deliver quality at an ever-increasing pace.

If you would like more information on the possibilities offered by Machine Translation, please get in touch and one of our Account Managers will be ready to help. Email us at [email protected] or give us a call on 0161 850 0060.
