TLM-related Tools Exploration

Background

Although computer-assisted translation tools are commonly used in the translation and localization industry, they are not the only technology we should be familiar with. In the Advanced CAT class, we explored other technologies that can boost our productivity. After dipping a toe into each technology, we completed a project on it to demonstrate our understanding.

Machine Translation: Microsoft Custom Translator

The first technology we explored was machine translation. Many people fear that machine translation will take over the translator’s role in the near future. To see how realistic that fear is, the whole class was split into groups to put Microsoft’s neural translation system to the test in different domains.

Our Domain and Scope

We were aware that the more restricted the text is, the better the machine translation output will be. However, to find out whether the machine is capable of doing a human’s job, my group decided to go with news commentary, a text type that is not restricted in any way. Our goal was to train an engine to translate New York Times economic and political news commentaries from English into Simplified Chinese.

Pre-training Stage

Training data: News Commentary Parallel Corpus
Tuning data: human-translated and manually aligned political and economic news commentaries from the New York Times.
Testing data: political and economic news commentaries from the New York Times.

We were not sure what the machine was capable of, so our initial goal was simply to produce understandable news commentaries. This is our initial proposal for achieving that goal; we predicted that a budget of $1,040 would be enough to reach it.

During Training

We found that it was easy for the engine to produce understandable articles, so we reset our goal: to produce articles that resembled human translation. In other words, we now expected high-quality translations. This is our updated proposal.

After Training

We eventually stopped after 42 rounds of training, having fed a total of around 200 million characters into the engine. Our conclusion was that it is extremely hard for an engine to produce quality translations of such an unrestricted text type. As shown in our presentation, we had spent a total of $3,531.80 by the time we stopped training, more than three times our initial estimate of $1,040.

After this project, I have a deeper understanding of what machines are capable of. Used properly, they can indeed save time and free up translators for more sophisticated tasks. Nonetheless, machines will not replace humans any time soon.

Trados QA Functionality: Regular Expression

QA is an important part of translation and localization projects, and there are simply too many aspects to check by hand. Automating some of these checks would be beneficial.

Our second mini project was to explore regular expressions on the regular expressions 101 website (regex101) and come up with specific rules that we could apply to our target language to speed up the QA process.

Here are the regular expressions we came up with for QAing translations from English into Simplified Chinese and Traditional Chinese. The screenshots show that the regular expressions correctly identified errors that would otherwise have required human intervention.

If we can come up with comprehensive regular expressions for the target languages, the formatting check during a localization project cycle can be integrated into the editing stage.
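To give a sense of what such rules can look like, here is a minimal Python sketch. The three patterns and the check_segment helper are hypothetical examples I made up for illustration; they are not the rules from our project, and the character range only covers the basic CJK Unified Ideographs block.

```python
import re

# Hypothetical QA rules for Chinese target text (illustrative only,
# not the expressions from the original project).
CJK = r"[\u4e00-\u9fff]"  # basic CJK Unified Ideographs block (an assumption)

RULES = {
    # An ASCII punctuation mark right after a Chinese character
    # usually should have been a full-width mark.
    "half-width punctuation after a Chinese character": re.compile(CJK + r"[,.;:!?]"),
    # Chinese text is normally written without spaces between characters.
    "space between Chinese characters": re.compile(CJK + r"\s+" + CJK),
    # The same full-width punctuation mark typed twice in a row.
    "doubled full-width punctuation": re.compile(r"([，。；：！？])\1"),
}


def check_segment(target):
    """Return the names of all rules that the target segment triggers."""
    return [name for name, pattern in RULES.items() if pattern.search(target)]


if __name__ == "__main__":
    for segment in ["我们的目标,是训练引擎。", "经济 评论。。", "这是正确的句子。"]:
        print(segment, check_segment(segment))
```

Rules like these only flag candidates. A reviewer still has to decide whether each hit is a real error, since a full stop after a product name, for example, might be intentional.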

Powerful Time Tracking Tool: Harvest

Keeping track of how much time we spend on projects lets us produce more precise estimates for future projects. Translation and localization professionals are probably familiar with free time tracking tools such as TopTracker and Toggl. My final mini project is a demo of Harvest, a powerful time tracking tool that offers not only time tracking but also expense tracking features that make invoicing easy. With this tool, a project manager can better organize time and expenses.

Final Thoughts

We approached the translation and localization industry from many different angles this semester, which really broadened my understanding of how complex this industry is. I will keep exploring new technologies so that I can keep up with the latest trends.