Legend of Zelda: Machine Translated

For the past few months, my cohorts, Kyle Chow, Linka Wade, and Xingyue Zhang, and I have been working on a project to see just how well a neural machine translation (NMT) engine can translate a modern game.

Our game of choice was Legend of Zelda: Breath of the Wild, a popular, well-translated, and rather fresh take on the very famous Legend of Zelda franchise. Our language pair of choice was Japanese to English.

The project spanned a month, during which we used both Microsoft Custom Translator (Azure) and Systran.

We used the BLEU (BiLingual Evaluation Understudy) metric, which scores each training round out of 100 by comparing the engine's output against a human reference translation, to determine how much our translations progressed or regressed as we trained our MT engines.
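For readers who have not met BLEU before, here is a minimal sketch of the idea in Python. It is a simplified, unsmoothed version with one reference per segment and plain whitespace tokenization; it is not the exact scoring that Azure or Systran run internally, and the function names and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU on a 0-100 scale.

    Assumptions: one reference per segment, whitespace tokenization,
    no smoothing (any n-gram order with zero matches gives a score of 0).
    """
    matched = [0] * max_n  # clipped n-gram matches, per n-gram order
    total = [0] * max_n    # n-grams produced by the engine, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Counter & Counter keeps the minimum count, which is the clipping step
            matched[n - 1] += sum((hyp_counts & ref_counts).values())
            total[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matched) == 0:
        return 0.0

    # Geometric mean of the n-gram precisions, computed in log space
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Penalize output that is shorter than the reference
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Made-up English segments, not taken from the game or from our data.
machine_output = ["the hero draws the sword"]
human_reference = ["the hero draws the master sword"]
print(corpus_bleu(machine_output, human_reference))
```

An output identical to the reference scores 100, and an output that shares no four-word sequence with it scores 0 in this unsmoothed version, which is part of why short, creative game dialogue can score so low.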

Breath of the Wild and all related images are trademarks of Nintendo of America Inc. These assets are used under fair use and not for monetary gain.

Process:

In order to train our two MT Engines, we fed training data and dictionaries to each machine (Systran and Azure). With Azure, we also had the choice to feed tuning data, which, as its name suggests, allows the machine to tune its translations with specific, well trained segments. Systran did not have this option.

As we obtained our scores, we used them to decide what kinds of data to feed our machines in the next training round.

Proposal:

You can see our pilot proposal PDF by clicking on the image above.

As you can see from our proposal, our project was quite an ambitious one. Games, as a literary medium, are difficult to translate with machines because what is written is often not meant literally.

In our proposal, we ran through our objective, timeline, datasets, processes, expected costs, and anticipated deliverables.

After we finished training our MT engines, we went through and rated our results, then added what we discovered to an updated proposal. We also wrote about our successes and our failures.

In this proposal, we also included recommendations for future training to make our work more efficient.

Finally, we chose which one we liked better, Systran or Azure.

As we ran through our tests, we were at first very disheartened when our project had a BLEU score below 8 points. Wow, was that bad. But then we remembered that our scores are meaningless outside the context of our project. What mattered was the improvement over that base score, not what the original score was.
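To make that concrete, here is a tiny sketch of how the gain over the baseline can be tracked from round to round. The function name is hypothetical and the scores are placeholders for illustration only; they are not our actual results.

```python
def improvement_over_baseline(scores):
    """Return each round's BLEU gain relative to the first (baseline) round."""
    baseline = scores[0]
    return [round(score - baseline, 2) for score in scores]


# Placeholder values for illustration only; these are NOT our real scores.
example_scores = [7.5, 9.0, 12.25]
print(improvement_over_baseline(example_scores))  # prints [0.0, 1.5, 4.75]
```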

And as we saw our score steadily climb, I felt invigorated by the changes.

My own contribution, recommending that we take training segments from more than just Legend of Zelda, ended up having a pretty good impact on our score, which made me very happy.

I wouldn’t have minded continuing this project even after our deadline.

Click on the image above to view a PDF of our updated proposal.

Lessons Learned

Sometimes, you just gotta beat it over the head.

As a project manager, I learned that although machine translation is still evolving, it has a long way to go before it is up to the task of translating a game like Breath of the Wild, or most other games for that matter, without a lot of post-editing.

Machines are very close to being able to translate technical things perfectly. In our project, we noticed that many of the in-game explanations were easy to understand and well translated overall. After all, many of these tips are straight to the point. However, as it stands, it is still more efficient to have the game translated properly by translators, not machines.

One thing I learned in the context of our project specifically is that segment alignment is critical when sentences are split up, as Japanese and English sentence structures are very different. Preflight and proper formatting are also very important, as having to manually fix each sentence can be extremely time-consuming. It is also very important not to get trapped in one source of training data. Eventually that line of data will stagnate.
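As a rough idea of what an automated preflight pass could look like, here is a minimal sketch that flags suspicious source/target pairs before they go into training. This is not the tooling we used; the function name, the sample sentences, and the `max_ratio` threshold are all assumptions for illustration, and the threshold would need tuning for real Japanese and English data.

```python
def preflight(pairs, max_ratio=6.0):
    """Flag suspicious (source, target) segment pairs before training.

    Returns a list of (index, reason) tuples. max_ratio is a hypothetical
    cutoff on the character-length ratio between the two sides of a pair.
    """
    problems = []
    for i, (source, target) in enumerate(pairs):
        source, target = source.strip(), target.strip()
        if not source or not target:
            # One side is missing, so the pair cannot be aligned at all
            problems.append((i, "empty segment"))
            continue
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_ratio:
            # A very lopsided pair often means a sentence was split differently
            problems.append((i, "length mismatch"))
    return problems


# Made-up segments, not taken from the game or from our data.
sample = [
    ("これは剣です。", "This is a sword."),
    ("ありがとう", ""),
    ("はい", "Yes, I will head to the village as soon as the rain stops."),
]
for index, reason in preflight(sample):
    print(index, reason)
```

A check like this only catches the obvious cases. Pairs that pass can still be misaligned, so a human spot check is still worth the time.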

Finally, good context can do wonders for a translation. Never underestimate visuals.

Japanese is much more subtle in its language and requires a lot more context to understand than English, which has a lot more detail packed into each sentence. When we translate literally, the game loses a lot of its charm.

There is a lot of creativity in the language of video games. I’ve realized that translating these works of art takes a lot of transcreation. Props to the translators who gave us such a fantastic game.

お疲れみんな! (Good work, everyone!)
Check out our lessons learned video below!

Until next time!