In “Why zero-shot translation may be the most important MT development in localization”, Arle Lommel writes that zero-shot translation stands to be one of the most important developments in machine translation because it allows translation in language pairs without training data.
NMT vs. PbMT
Because zero-shot translation (ZST) is a new capability of Google’s neural machine translation system, Lommel first illustrates the differences between neural machine translation (NMT) and the formerly state-of-the-art phrase-based statistical systems (PbMT). The biggest advantage of NMT is that it can analyze more complex correlations, which makes the resulting translations more accurate.
Zero-Shot Translation vs. Pivot Translation
Lommel agrees that NMT represents a significant step forward in machine translation (MT) technology, but he believes that Google’s announcement about deploying zero-shot translation in its Google Neural Machine Translation (GNMT) system may be the most important advance in the long run.
He defines zero-shot translation as “the ability to translate in language pairs for which a system has not been trained”. To make this clearer, Lommel compares ZST with classic pivot translation and explains how each works using a Finnish <> Greek case.

Pivot vs. zero-shot translation. Source: Common Sense Advisory
There is little Finnish <> Greek training data, but plenty of data for each language in combination with English. In pivot translation with traditional PbMT technology, a text is translated into one language and then from that language into another (such as Finnish > English > Greek). Each engine can deal with only one language pair. Google’s NMT engine, however, can handle multiple language pairs simultaneously because the neural system feeds all training data into one engine.
ZST helps NMT systems work well in language pairs with insufficient training material because it makes use of all relevant data from all combinations. In the Finnish <> Greek case, pivot translation uses data from only two language pairs, Finnish <> English and English <> Greek, whereas ZST in an NMT system also leverages related data in four or five other languages. Lommel also notes a disadvantage of pivot translation: MT errors in the first language pair tend to compound in the second one.
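The compounding effect can be illustrated with a small sketch. It is not taken from Lommel's article: it assumes, purely for illustration, that each leg of a pivot chain translates a sentence correctly with some fixed probability and that errors in the two legs are independent. The function name `pivot_accuracy` and the per-leg figures are hypothetical, not measured values.

```python
def pivot_accuracy(leg_accuracies):
    """Chance that a sentence survives every leg of a pivot chain.

    Assumes each leg succeeds independently with the given probability,
    which is a simplification made only for this illustration.
    """
    result = 1.0
    for accuracy in leg_accuracies:
        result *= accuracy
    return result


# Hypothetical per-leg accuracies for Finnish > English and English > Greek.
finnish_to_english = 0.9
english_to_greek = 0.9

chained = pivot_accuracy([finnish_to_english, english_to_greek])
print(f"Finnish > English > Greek: {chained:.2f}")
```

Under these assumptions the chained result is always at most as good as the weaker leg, which is the sense in which errors compound across a pivot.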
The importance of Zero-Shot Translation
Lommel points out how important ZST is for under-resourced language pairs such as Finnish <> Greek, and for organizations like the European Union, which would need 552 engines in a pivot-based process to provide access to its institutions through all of its 24 official languages. He argues that ZST, rather than NMT in general, which still relies on human translation, deserves the greater attention.
In September 2016, Google announced that Google Translate would switch to a new GNMT system. In November, the GNMT system was extended to enable zero-shot translation. Lommel explains what ZST is and what impact it will have on language access. The differences between ZST and classic pivot translation shown in the Finnish <> Greek example help readers gain a better understanding. However, the article does not explain specifically how ZST works. According to Google AI, the system is learning a common representation, which is interpreted as a sign of the existence of an interlingua in the network.
Lommel is a senior analyst for Common Sense Advisory (CSA Research), an independent market research company focusing on translation, interpreting, localization, globalization, and internationalization practices. In another article about ZST on the CSA website, he explains more about how Google’s multilingual neural engine works. “Google’s system does not use a pivot,” says Lommel. Instead, it uses all available data to move directly from one language to another.
Lommel believes that the tech press has paid too much attention to NMT while ignoring the significance of ZST. From the title onward, he repeats throughout the article that ZST is truly disruptive in MT development. He claims: “Zero-shot translation will help fill the large gap where the alternative to MT is not the human translation but zero translation instead.”
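The figure of 552 matches the number of directed language pairs among 24 languages, 24 × 23, if one assumes a separate single-pair engine for each translation direction. The short sketch below reproduces that arithmetic; the function name `directed_pair_count` is hypothetical, and the one-engine-per-direction reading is an assumption rather than something the article spells out.

```python
def directed_pair_count(num_languages):
    """Number of ordered (source, target) pairs among the given languages.

    Each language can be a source for every other language, so the count
    is n * (n - 1). Assumes one single-pair engine per direction.
    """
    return num_languages * (num_languages - 1)


eu_official_languages = 24
print(directed_pair_count(eu_official_languages))  # 552
```

By contrast, a single multilingual engine of the kind described above would cover the same combinations without the count growing quadratically with each added language.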