Localizing a French Children’s E-book

In the Multilingual Desktop Publishing course, I was excited to learn about real-world localization workflows for different products, and I gained plenty of hands-on experience with Adobe Suite, Sigil, and SubSync. Of all the topics we covered, I liked E-book localization best. Localizing E-books into other languages is a joyful adventure for book lovers. I cannot wait to create my own original E-books and localize them into my favorite languages.

The basic E-book creation and localization workflow is demonstrated below. As a beginner, you will get familiar with the epub file type and gradually become a master of Sigil.

My E-book Project 

For my showcase project, I chose a French children’s E-book titled Miquette baptise sa poupée, written by Albums Camo. My goal was to localize it into English for American kids to read. It includes many graphics, which are visually attractive for children but make it a little challenging for localizers to choose proper fonts and deal with CSS issues. This project brings multiple tools together, including Sigil, Photoshop, and CAT tools.

Epub Extraction 

When you start an epub localization project, the first step is to extract the epub file into its components. An epub file is a ZIP archive; in this case it contains a “mimetype” file, a toc.ncx file, a content.opf file, a META-INF folder, an image folder, and an HTML file.
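Because an epub file is a ZIP archive, the extraction step can also be scripted. Here is a minimal sketch using only Python's standard library. The function name `extract_epub` and the paths in the usage note are made up for illustration and do not come from any tool mentioned in this post.

```python
import zipfile
from pathlib import Path


def extract_epub(epub_path, out_dir):
    """Unzip an epub into out_dir and return the names of its members."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(out)
        return sorted(archive.namelist())
```

Calling `extract_epub("book.epub", "book_extracted")` (both paths hypothetical) would leave the mimetype file, the META-INF folder, and the other components in the output folder, ready to be sorted into translatable and non-translatable parts.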

File Preparation 

From the viewpoint of a future project manager, things always need to be well organized in order to meet high quality standards and potential deadlines. When I prepared the files for translation, I separated the different components of the epub file and carefully assigned each one to the right tool.

Not to be Translated: 

META-INF folder, the “mimetype” file

To be Translated:

  1. the HTML file: CAT tool
  2. an image file (book cover): Photoshop
  3. the toc.ncx file, the content.opf file and metadata: Sigil

Normally, I would prepare a Picture List for metadata and for text embedded in images. Since this E-book has very little text in its metadata and images, I chose to translate it directly.

Content Translation 

The text content of this E-book is all contained in the HTML file. Previously, I used Memsource as my CAT tool to translate HTML files. This time I tried something new. After the CEO of Lilt came to MIIS, I felt curious about their products and had wanted to give them a shot ever since. This was a great opportunity for me. I chose Lilt as my CAT tool and explored its NMT-based translation platform a little.

Here is a list of their supported file types for localization projects (HTML included)

After I uploaded my HTML file, I translated the content using their default FR-EN translation memory, with their own terminology Lexicon as a second reference. When the translation was complete, I exported the new HTML file.

Book Cover Localization

The title and the author’s name are embedded in the cover image. I used Adobe Photoshop to change them into English. In the process, I used the Marquee tool to erase the original text and replaced it with English text in a cute font I had newly installed (orange juice).

DaFont is a good, free resource for finding fonts that match your needs. It categorizes fonts by language and style.

After I localized the cover and exported the new image, I replaced the original one with it in the image folder.

Pack Your Epub!

By this step, I had already replaced the HTML file and the image. I then packed the updated components using ePubPack, which created a new E-book in English.
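ePubPack takes care of this step, but for readers who prefer a script, here is a rough sketch of the same idea in Python. The EPUB container format expects the mimetype file to be the first entry in the archive and to be stored without compression, which is why it is handled separately. `pack_epub` and its paths are hypothetical names, and this is not necessarily how ePubPack works internally.

```python
import zipfile
from pathlib import Path


def pack_epub(src_dir, epub_path):
    """Zip an extracted epub folder back into a single .epub file."""
    src = Path(src_dir)
    with zipfile.ZipFile(epub_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry is written first and stored uncompressed.
        archive.write(src / "mimetype", "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in sorted(src.rglob("*")):
            name = path.relative_to(src).as_posix()
            if path.is_file() and name != "mimetype":
                archive.write(path, name)
    return epub_path
```

The output file should be written outside the source folder so that it does not end up inside its own archive.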

This is a random page from the book. Because the book includes many images, I played around with the CSS and adjusted the layout to make the new content fit.

Sigil Localization

There are still some tasks left to be done in Sigil. I opened my English E-book in Sigil, and opened the metadata editor. I changed the language and the title here. In the toc.ncx and the content.opf sections, a few things need to be changed as well.
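Sigil's metadata editor handles these changes through its interface. For illustration only, the same edit to content.opf can be sketched in Python. The sketch assumes the file uses the usual Dublin Core `dc:title` and `dc:language` elements; `localize_opf` is a hypothetical helper, and the English title in the usage note is a placeholder rather than the actual localized title.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def localize_opf(opf_xml, title, language):
    """Return the OPF document with its dc:title and dc:language replaced."""
    # Register readable prefixes so the output does not use ns0, ns1, ...
    ET.register_namespace("opf", OPF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.fromstring(opf_xml)
    for tag, value in (("title", title), ("language", language)):
        element = root.find(f".//{{{DC_NS}}}{tag}")
        if element is None:
            raise ValueError(f"dc:{tag} not found in the OPF metadata")
        element.text = value
    return ET.tostring(root, encoding="unicode")
```

For example, `localize_opf(opf_text, "Placeholder English Title", "en")` returns the updated XML as a string. The output is namespace-equivalent to the input but writes the package elements with an explicit `opf:` prefix, so it is worth opening the result in Sigil to confirm it still validates.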

Click here to see my slides for my presentation:

Presentation

The Wonderland of Translation Technologies II

 

 

In the wonderland of translation technologies, a good command of CAT tools is far from enough for a qualified industry practitioner. Many other fascinating tasks and tools beyond the world of CAT tools are gradually gaining attention, such as implementing a localized WordPress site, handling bilingual data, constructing QA models, training an MT engine, and using utilities for practitioners. This blog will walk you through them one by one and give you a deeper understanding of the tech-oriented translation and localization industry.

 

Pseudo Translation 

You should never underestimate the significance of running a pseudo translation before diving into a project. It can expose many internationalization issues that deserve special attention, such as:

  • Over-externalization
  • Code improperly included with translatable text (revealed when the translatable text is automatically replaced)
  • Improper display of character sets
  • Text not included for translation
  • Text embedded in images
  • Unhandled text expansion and contraction
  • Concatenation

In one of my projects, I exported the XML file from a WordPress site and pseudo-translated it in Trados. After I set up a proper filter configuration and ran the pseudo translation, I successfully imported the new XML file into the site.
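To make the idea concrete, here is a small sketch of what a pseudo translation does to a string. It is not how Trados implements the feature; the function name, the 30% default expansion, and the bracket and tilde markers are all assumptions chosen for the example. Accented vowels test character-set display, the padding simulates text expansion, the brackets make truncation and concatenation visible, and anything that still appears in plain English afterwards was never sent for translation.

```python
import math
import re

TAG = re.compile(r"(<[^>]+>)")
ACCENTS = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudo_translate(text, expansion=0.3):
    """Accent the vowels, pad for expansion, and leave markup untouched."""
    pieces = TAG.split(text)
    visible = 0
    converted = []
    for piece in pieces:
        if TAG.fullmatch(piece):
            converted.append(piece)  # markup is passed through as-is
        else:
            visible += len(piece)
            converted.append(piece.translate(ACCENTS))
    padding = "~" * math.ceil(visible * expansion)
    return "[" + "".join(converted) + padding + "]"
```

The simple tag pattern treats everything between angle brackets as code, so it would need refining for markup with translatable attributes.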

 

This process requires a good command of coding. When I handled the code in the bilingual XML files, I struggled to separate the untranslatable part from the translatable part and make it work smoothly in Trados and WordPress.

Nowadays many CMSs are connected and integrated with TMSs, which saves a lot of effort. Even so, the process described above still needs attention from localization engineers for more complicated localization tasks.

 

Collecting and Cleaning Bilingual Data 

When we use a large volume of bilingual data to train an MT engine, maintaining the quantity and quality of the data pool is often a headache. In the era of big data, collecting and cleaning need to be automated rather than done manually.

Data Collection: Many online corpora offer bilingual files for download, such as OPUS (Click here). Many more corpora are available for specific subject fields.

Data Cleaning: Many tools can be used to clean your data and boost its quality. One handy tool is Olifant (Click here). As a TM editor, Olifant lets you import and export translation memories in different formats (such as TMX or tab-delimited). You can edit the translation units, their attributes, and any other associated data.

Data Alignment: Many tools can be used to align source language files to target language files, such as memoQ, YouAlign, TMXmall, etc.
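As a rough illustration of what automated cleaning can look like, the sketch below filters a list of aligned sentence pairs. It does not reproduce what Olifant does internally; the function name and the checks (empty segments, untranslated segments, extreme length ratios, duplicates) are assumptions chosen for the example, and the character-based ratio threshold would need tuning for each language pair.

```python
def clean_pairs(pairs, max_ratio=3.0):
    """Normalize whitespace and drop empty, untranslated, lopsided, or duplicate pairs."""
    seen = set()
    cleaned = []
    for source, target in pairs:
        source = " ".join(source.split())
        target = " ".join(target.split())
        if not source or not target:
            continue  # one side is empty
        if source == target:
            continue  # probably left untranslated
        ratio = max(len(source), len(target)) / min(len(source), len(target))
        if ratio > max_ratio:
            continue  # lengths differ too much, likely a misalignment
        if (source, target) in seen:
            continue  # exact duplicate
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned
```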

 

QA Model 

QA metrics and models have been developed by many institutes to track and measure translation quality in the industry. They can be used to train MT engines, as well as to score linguists in TMSs.

The well-known ones are MQM, the LISA QA Model, and DQF (Click here).

These QA models can be integrated into translation management systems or developed into independent tools such as MQA Scorecard and Okapi CheckMate.

There are also several scores, designed by humans and generated by the system, that indicate translation quality:

  • GAP Analysis: Unknown words to be added to the engine’s phrase table
  • BLEU: Measures both vocabulary and fluency (whether words are in the right order)
  • F-Measure: Combines recall and precision; helps identify vocabulary problems (location and causes)
  • TER: Translation Edit Rate, the number of edits needed to turn the MT output into a reference translation
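To give a feel for how two of these scores are calculated, here is a simplified, word-level sketch. Real evaluation tools are more sophisticated (TER, for instance, also counts phrase shifts as edits), so treat the functions below as teaching aids with made-up names rather than reference implementations.

```python
from collections import Counter


def f_measure(hypothesis, reference):
    """Harmonic mean of word-level precision and recall."""
    hyp, ref = hypothesis.split(), reference.split()
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def edit_rate(hypothesis, reference):
    """Word-level edit distance divided by reference length (no shifts)."""
    hyp, ref = hypothesis.split(), reference.split()
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, 1):
        current = [i]
        for j, ref_word in enumerate(ref, 1):
            current.append(min(
                previous[j] + 1,                          # delete a word
                current[j - 1] + 1,                       # insert a word
                previous[j - 1] + (hyp_word != ref_word), # substitute a word
            ))
        previous = current
    return previous[-1] / max(len(ref), 1)
```

A higher F-measure is better, while a lower edit rate is better, since it means fewer corrections are needed to reach the reference.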

 

Theories and Implementations of Machine Translation 

Theories: RBMT, SMT, and NMT

Three types of machine translation are popular in the market for computer-assisted translation technologies: rule-based, statistical, and neural machine translation. Briefly, they are defined as follows:

  • Rule-based machine translation (RBMT) is machine translation systems based on linguistic information about the source and target languages basically retrieved from (unilingual, bilingual or multilingual) dictionaries and grammars covering the main semantic, morphological, and syntactic regularities of each language respectively. (Retrieved from Wikipedia)
  • Statistical machine translation (SMT) is a machine translation paradigm where translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. (Retrieved from Wikipedia)
  • Neural machine translation (NMT) is an approach to machine translation that uses a large artificial neural network to predict the likelihood of a sequence of words, typically modeling entire sentences in a single integrated model. (Retrieved from Wikipedia)

Here is an infographic to show the comparison among three types of MTs:

 

Implementation: Microsoft Translator Hub

There are several tools where you can customize your own MT engine. One of them is Microsoft Translator Hub. Here are two examples of how to use it to customize your own SMT engine:

Case Study –

Using the Microsoft Translator Hub at The Church of Jesus Christ of Latter-day Saints  (Click here)

Final Project – 

Statistical Machine Translation Training for SIPO (Click here)

 

Recently, Microsoft released a state-of-the-art feature:

Customize your NMT engine (Click here to try it out)

 

Utilities for Practitioners

Here are some simple and handy tools you may want to explore as a translation and localization practitioner. Don’t underestimate them, because they can save a lot of time and effort at your workplace:

Here is a demo on how to use Cmap. (Click here)

 

La fin: Trends in Translation Technology

The translation and localization industry is changing and growing rapidly, so keeping up with emerging trends is just as important as understanding what has already been developed.

Video Localization: VideoLocalize

Mashups with NMT: AI tech: chatbots, sentiment analysis; Voice-enabled tech: smart home tech, automated interpreting

Resources to keep up with the tech trend: 

Blog: Slator, Multilingual Magazine

Conference: Women in Localization, SF Globalization, Loc World, IMUG, GALA, TAUS, ALC, Brand to Global, Unicode Conference

Podcast: Globally Speaking, Worldly Marketer

 

Exploring the Success of Translation Management Systems

In the translation and localization industry, a new trend has emerged and led many LSPs into the next era: the wide acceptance and application of Translation Management Systems (TMSs). A TMS can automate many of the manual steps in a typical translation workflow, which is a great way to save time and cost.

 

TMS Features in General

There is a variety of cloud-based TMSs developed by different companies, and most of them share a lot of similarities. What are the standards for measuring the success of a TMS? The matrix below answers this question. It has two dimensions. One is time: before, during, and after a project. The other is function: linguistic, technology, communication, and analytics.

 

TMS Case Study: WorldServer

Among all the TMSs on the market, WorldServer is probably the most classic one. It is an SDL product, originally developed by Idiom Technologies, and is widely used by both vendor-side and client-side companies (such as eBay). Here is an infographic demonstrating how to build a project in WorldServer (not an easy process!).

After creating and running a whole project, my suggestion is: never jump in too soon. You’d better sketch out the structure first and make sure you have included all roles and steps. Then follow that plan and create things step by step.

My team did a project for Clarins using WorldServer as our TMS. Click here to see our awesome group work:

Clarins Project

 

TMS integration with Content Management System 

With the popularization of Content Management Systems, many companies have integrated their TMSs with their CMSs and achieved a smooth workflow:

Content Creation — Content Localization — Real-time Publishing 

This integration spares engineers the linguistic preparation work and keeps content creation and translation in sync. It is a great advantage for LSPs that compete on go-to-market speed.

Example: Lingotek TMS (Click Here to see its features).

 

TMS integration with Crowdsourcing 

Crowdsourcing is another popular choice for many client-side companies and language service providers. It refers to a situation in which a crowd steps forward and does the translation work collaboratively, often without direct financial motivation. It has been successfully implemented by many business and technology giants, such as Facebook, Twitter, and Adobe. When the volume of translation is massive or the deadline is urgent, the power of the crowd can ensure high efficiency. It is also a brilliant way to promote a brand (Take the brand to the cloud).

Example: Lingotek TMS (Click Here to see its features).

 

TMS Selection and Adoption 

Taking all the possible features of a TMS into consideration, you might realize that it is difficult for companies to choose their own TMS from the cloud. Here are some principles to follow when you select and adopt a TMS that meets your interests:

  • Analyze your needs: Different TMSs have different features. Some are linguistically oriented and have more features for TMs, TDs, etc. Some are business-oriented and make it easy to manage resources. How do you prioritize your needs?
  • Include all stakeholders: There are many stakeholders in a translation project. Make sure to communicate with all related departments, and set up roles such as translators, reviewers, and project managers correctly in the system.
  • Build or buy: Consider the pros and cons of building your own TMS versus buying one from a third party. Then make your choice.
  • Automation: Think through which parts of a project can be automated. Integrate all the automation possibilities into the system in order to facilitate the process.
  • Test run: To ensure the success of your TMS, give it a test run. Focus on the quality of the translation output, the efficiency, and any other concerns you need to take into account.

When you choose a TMS based on the principles listed above, you may need to compare several candidates. Here is a comparison I did between Lingotek TMS and XTM Cloud. Based on my comparison, I chose a TMS for Alibaba.

Click to see the article: TMS Comparison

WorldServer Practice: Translation Project for Clarins

In the Translation Management Systems course, our group completed a project for Clarins, a world-renowned cosmetics company. In the process, we gained professional experience in client education and communication. We played around with WorldServer and came to understand its role in the different phases of a translation project. Overall, it was an impressive journey for us. Many thanks to my colleagues Isabella Sun, Weixin Mo, Wanli Zhou, and Muge Zhang.

Let me walk you through the whole process of our project. First, here is the basic information of our project for Clarins.

Project Description | Isabella 

  1. Client Info: Clarins Group, trading as Clarins, is a French luxury skincare, cosmetics, and perfume company, which manufactures products that are usually sold through high-end department store counters and selected pharmacies.
  2. Project Overview: The content we translated for Clarins this time is the Clarins Skin Spa Brochure. The skin spa program by Clarins has been established in over 40 countries. They asked us to build a translation memory and termbase for them as well as translate the updated brochure. It contains 879 words (including words in pictures). Starting from April 6, our group finished this project within two weeks.
  3. Tools and Working Phases: To facilitate the whole process and ensure the quality of the deliverables, we used WorldServer to organize our work and generate quotes, and we used SDL Trados to build translation memories. Briefly, our work was divided into three phases: preparation, production, and finalization.

We drafted the business proposal for Clarins before the client meeting. Click here to see our proposal:

Proposal for Clarins 

We held our client meeting in April and summarized our reflections on it as follows:

Lessons Learned from Client Meeting | Tianlin 

  • Client Education: Since our client is not from the language industry, they have very limited knowledge of our typical workflow. Good communication needs to be set up from the beginning to avoid misunderstandings.
  • “Why should they care?”: Since our client is a for-profit company, they focus more on benefits than on features. We should follow this logic and explain more about the benefits of our chosen resources and tools. They don’t need to know every step of our effort, only what results our effort can bring them.
  • Contact Person: We should assign a contact person to guide our client through the whole process and answer their questions.

After the client meeting, we delegated the tasks according to our proposal. Each group member did a wonderful job and met some challenges as well. 

Here are some screenshots of the localized file (major deliverable): 

 

Here are our deliverables, including the source file, the localized file, termbase and translation memory tmx file, and pseudo translation file: 

Deliverables

 

Here are the lessons learned at each step:

Pseudo-Translate | Wanli

  1. Random Uppercase: The original source file was a PDF. After we converted it to a Word document, the text layout changed greatly, and uppercase letters showed up randomly in the text. This didn’t affect the pseudo translation, but it would become a problem in translation.
  2. Text Bleeding: The pseudo translation showed a lot of text bleeding.
  3. Overflow: Lengthened text continued onto a second page that had no corresponding pictures.

 

Termbase | Weixin

  1. Because the source file is relatively short, we could give more thought to building the term base. Most of the terms in our term base are nouns because nouns are usually the foundation for understanding sentences.
  2. We need to pay special attention to the encoding. The term base file needs to be saved as UTF-16LE so that it can be imported into WorldServer. Likewise, if we export the term base, the terms will turn into mojibake if the program that opens the file does not use the correct encoding.
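The encoding point can be illustrated with a small Python sketch. It assumes the term base is exchanged as a tab-delimited text file, which is an assumption made for this example rather than a documented WorldServer requirement. `termbase_bytes` and the sample term pair are invented for illustration.

```python
import csv
import io


def termbase_bytes(rows, encoding="utf-16-le"):
    """Serialize (source, target) term rows as tab-delimited text in the given encoding."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\r\n")
    writer.writerows(rows)
    return buffer.getvalue().encode(encoding)


data = termbase_bytes([("source term", "目标术语")])
readable = data.decode("utf-16-le")  # matching encoding: the terms come back intact
garbled = data.decode("latin-1")     # wrong encoding: the terms turn into mojibake
```

Whether the importing system also expects a byte order mark at the start of the file is something to confirm in its documentation; this sketch does not write one.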

 

Translation Memory | Muge

The client didn’t provide us with a translation memory, so we offered to create one for them as an extra benefit. After we translated the source text into Chinese, we used SDL Trados Studio to align the segments and create a translation memory. The TM file is in TMX format.
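For readers who have never opened a TMX file, the sketch below builds a minimal one from aligned segment pairs. In our project Trados Studio produced the file; this Python version is only meant to show the structure. The header values, the language codes `en-US` and `zh-CN`, and the function name are assumptions for the example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en-US", tgt_lang="zh-CN"):
    """Build a minimal TMX 1.4 document from (source, target) segment pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "sketch",        # placeholder values
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "unknown",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        unit = ET.SubElement(body, "tu")  # one translation unit per pair
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            variant = ET.SubElement(unit, "tuv", {XML_LANG: lang})
            ET.SubElement(variant, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

Each `tu` element holds one aligned pair, with one `tuv` per language, which is what a CAT tool looks up when it searches the memory for matches.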

 

Desktop Publishing | Tianlin

 

  1. Picture List: The document from our client includes many ad images. While extracting text from images, I found it important to include screenshots for the translators’ reference, so that they can get a good grasp of the context and come up with appropriate translations. (Lingotek can do a better job in terms of linguists’ reference.)
  2. Style: Because the document is for commercial purposes, the localized images need to be visually attractive to grab the audience’s attention. When English text is replaced by Chinese, certain typefaces and type sizes need to be changed as well. Our team needs to inform clients about these changes and get their approval first. In addition, consulting a professional advertising graphic designer, if the budget allows, may be a good idea to ensure the high quality of our service.

 

 

 

After the project was completed and closed, we also held a post-project review meeting.

Click here to see the slides

 

Notes: 

All the material mentioned above was created and written by the whole team:

Isabella Sun, Weixin Mo, Wanli Zhou, Muge Zhang, Tianlin Li

 

DJI Terminology Management

Terminology management is an essential part of language services. Whether you are in academia or in the workplace, having a unified term bank as an easy-to-reach reference can greatly boost the quality and efficiency of your work. Terminology has a positive influence on many roles:

  • Experts: They can write subject-oriented documents using unified terms, which creates a consistent tone.
  • Translators: Terminology provides a knowledge base for their reference, facilitating the translation process.
  • Clients: Terminology helps ensure that the documentation of a product is accurate, reducing the risk of liability issues. It also reduces time and cost.
  • End-users: Consistent terms prevent confusion when they use products.

In my terminology management class, my group decided to provide a terminology service for DJI, a leading drone company in China. We overcame obstacles in both management and technology and came up with a wonderful plan. Many thanks to my team members: Kayla Munoz, Isabella Sun, and Brenda Hu.

 

Scenario 

Client representative: Department Liaison

“Our company is decent sized. We have a few hundred people spread across eight or nine departments. My role was created to liaise with software engineering, technical writing, and the parts department. None of them can get along. You see, each department is contributing content to big product launches worldwide. But they can’t agree on single terms to use for device components. The engineers are using abbreviated terms so that content meets character limitations of the GUIs. The folks over in parts have their own identifiers for those same components. Then, we have the technical writers who get so mad when they’re not consulted about the abbreviations or identifiers used.

These issues we have with inconsistencies are only compounded when we send content to translation. Terms don’t match across content products, and our customers are starting to get really confused. We need a way to link all of these terminology sets together…”

 

Case Analysis

Company Name: DJI

Company Website: dji.com

Basic Information: DJI (Da-Jiang Innovations Science and Technology Co., Ltd) is a Chinese technology company headquartered in Shenzhen, Guangdong. It’s known as a manufacturer of unmanned aerial vehicles (UAV), also known as drones for aerial photography and videography.

Organizational Needs: As a company that accounts for 85% of the global consumer drone market, DJI is decently sized and still expanding. The resulting problem is a lack of communication among different departments. Because this could harm the company’s future development, DJI’s department liaison contacted us in the hope that we could find a solution.

Appraisal on Stakeholders’ Need: Our client has been fully aware of the major problem that exists in DJI. They pointed out that the inconsistencies in translation caused most of the issues. Based on this, we believe our client will be cooperative and willing to provide us with the necessary materials to build a terminology database.

Organizational Gaps and Inadequacies: The inconsistencies in translation content caused inefficiency in cross-departmental communication as well as consumers’ confusion.

 

Solution 

Based on the scenario and case analysis above, our team developed a terminology solution for DJI. Click here to see our pitch:

DJI Terminology Pitch

 

After presenting our pitch to the class, we also got a chance to see the fantastic work of other teams and absorb as much wisdom from our peers as we could. Although we chose different scenarios and cases, we have a lot to learn from one another in terms of terminology service. Here is a list of their strategies collected by our professor, Alaina Brantner. It’s definitely a good resource for a future career in terminology management:

Strategies from peers

 

Reference

The-benefits-of-Terminology-A-guide-for-language-lovers

Notes

All materials are written by our whole team:

Kayla Munoz, Isabella Sun, Brenda Hu, Tianlin Li

Statistical Machine Translation Training for SIPO

 

In the Advanced CAT class, we had a chance to try the second of the three main types of machine translation (rule-based, statistical, and neural): statistical machine translation. We chose a field, built a bilingual corpus from scratch, and trained our own SMT engine. As a graduate student from a pure liberal arts background, I found the whole process somewhat torturous, including data collection and cleaning, system maintenance, and so on. However, I learned quite a lot from it. Few people in the translation industry, apart from engineers, get to go beneath the theoretical surface of machine translation, shift their role from “user” to “developer,” and understand its pros and cons from the inside. Many thanks to my incredible team members, Rory Hou, Wei Wu, Melissa Zhang, and Cheng Song.

 

 

  • Choosing field

The first headache we faced was: which subject field should we choose? First, we needed a field with plenty of existing bilingual text that is easy to extract. Second, the text in this field should tend to be “hard”, with features such as field-specific terminology and a unified writing style. Text like poetry or soap opera lines was definitely off our list. Eventually, we agreed to work on patent translation.

We chose the waterfall workflow for our project.

Here is our proposal for the pilot project: SMT proposal

After the kick-off meeting with our “client” Adam Wooten, we started our project.

 

  • Data Collection, Cleaning, and Alignment

Next, we needed to collect enough clean data to build our bilingual corpus. A well-functioning SMT engine needs an enormous corpus as its underlying support (eBay’s corpus for SMT training contains 1.5 million words). Due to time constraints, we could not reach that standard, but we tried our best.

Data Source: We chose WIPO Patentscope, UN documents, and PKU Law as our major sources of bilingual data. Data cleaning tools such as Okapi Olifant can be used to clean up untranslatable strings or other irrelevant parts.

 

 

Text Alignment: After we extracted enough bilingual text from those sources, we needed to align the English text with the Chinese text. Because of the huge amount of data, many alignment tools we had used before, such as YouAlign, would not work without a paid membership. Fortunately, we found a very handy tool, TMXmall, which helped us align tons of documents and export them as TMX files in a short time.

 

Closure: The bilingual corpus is ready to go. We also cleaned up some monolingual text for later use.

 

  • SMT Engine Training Kick-off

After the data preparation was finished, we got into the Microsoft Translator Hub, and created a new project.

We imported data from the bilingual corpus into the training, tuning, and testing sections and started the first round of training. When a round of training finished, the system sent notifications to each of us, and we could see an auto-generated BLEU score indicating the linguistic quality of the trained engine. Then we changed some of the data and ran the second round. Since the BLEU score reflects the quality of the data used in each round, we adjusted the imported data for each new round based on how the score fluctuated. Our goal was to make the BLEU score go up.
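Dividing a corpus into training, tuning, and testing sets can be done by hand, but a short script keeps the split reproducible between rounds. The sketch below is only an illustration: the 10% shares for tuning and testing and the function name are assumptions, not the proportions our team used or anything prescribed by the Hub.

```python
import random


def split_corpus(pairs, tuning=0.1, testing=0.1, seed=42):
    """Shuffle sentence pairs and split them into training, tuning, and testing sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_tuning = int(len(shuffled) * tuning)
    n_testing = int(len(shuffled) * testing)
    return {
        "tuning": shuffled[:n_tuning],
        "testing": shuffled[n_tuning:n_tuning + n_testing],
        "training": shuffled[n_tuning + n_testing:],
    }
```

Keeping the three sets disjoint matters because a BLEU score computed on sentences the engine has already seen in training would look better than the engine really is.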

  • Quality Review 

After each round of training, we reviewed the MT results and came up with a score based on our quality metrics (using MQM standards). We set the scale of the score from 0 to 1000 to measure translation quality.
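This post does not record the exact formula behind our 0 to 1000 score, so the following is only one plausible way to turn MQM-style error counts into such a score. The severity weights (1, 5, and 10), the per-word normalization, and the function name are all assumptions made for illustration.

```python
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}  # assumed weights


def quality_score(error_counts, word_count, scale=1000):
    """Subtract weighted error penalties, normalized by word count, from a full score."""
    penalty = sum(SEVERITY_WEIGHTS[severity] * count
                  for severity, count in error_counts.items())
    return max(0.0, scale * (word_count - penalty) / word_count)
```

For example, a 100-word sample with two minor errors and one major error carries a penalty of 7 points per 100 words under these weights.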

 

We calculated the time and cost of our pilot project. Unsurprisingly, they differed from our initial estimates. We updated our proposal based on the findings of the pilot project.

 

Here is our Updated SMT proposal: Updated SMT proposal

Plus, we summarized all we learned from this project: Lessons Learned

 

GalaXync - Bringing Worlds Together

In the Localization Project Management class, I had a chance to build my first language service company from scratch with my awesome team members: Helen Jung, Chris Healy, Muge Zhang, Jiyi Zeng, and Isabella Sun. It’s definitely one of my best experiences at MIIS. We named our baby GalaXync. We’re here to be your bridge across linguistic divides, connecting you to an audience of millions around the world.

(Official Website)

We chose WordPress to build our official website. Here are our mission and vision statements.

Our Mission –

GalaXync is a multi-language vendor that specializes in three East Asian languages: Chinese, Japanese, and Korean. Our executive team firmly stands behind the idea that no matter where one is from, we can synchronize any form of content to be relevant and enjoyable to natives of other parts of the world. Composed of expert linguists and project managers with years of combined translation, localization, and consulting experience from a multitude of industries, GalaXync is the best partner to launch your company into today’s ever-competitive, ever-changing global market.

Our Vision – 

GalaXync strives to be the bridge connecting everyone in different parts of the galaxy and close the linguistic or cultural gaps between them. While our focus is on East Asia, we hope to continuously grow our services, customer base, and areas of expertise, becoming the world’s largest localization services provider with no language left behind. Contact us or visit us in our office, where our team is taking one step closer every day to our goal of bringing worlds together.

 

After rounds of discussion, we settled on a standardized translation and localization workflow, which goes as follows.

  • Preparation
  1. Inquiry: The first thing that happens in any project is that you reach out to us! One of our account managers will then meet with you to find out how we can help you best. What’s the scope of the project? What audience are we helping you reach? Do you have any budgetary or time constraints? With this information in hand, we’ll help your project go from impossible to possible and kick off a rewarding relationship for both of us.
  2. File analysis: The next thing we’ll need is the files you want to be translated. Our hardworking staff of engineers will analyze the files to create a word count and then prepare them for translation, which will make our translation efficient and keep your files safe.
  3. Team assembly: The project manager will determine the best way to meet your translation goals, assemble a crack-team of our best translators, and create a timeline. Other companies might wait until after the contract is signed to assemble a team, but we will never tell you we can take on a project before we know we have the exact right people to do it. With a full-professional, in-house group, we can assure the quality of our translations. Plus, if you know who’s working on your product, you can rest easy knowing it’s in good hands.
  4. Initiation: The engineers and the project manager send their details to the account manager, who uses this information to compile a quote ( Click here to see a quote sample of a given project: Quote Sample ). The account manager then sends this quote to you, including a detailed breakdown of the various costs that will depend on the nature of the project, its size, and its speed. If you like what you see, we’ll write up a contract and sign it with you.
  • Translation
  1. Getting the project to the team: Congratulations! The localization of your product is now officially underway! The project manager distributes the files for translation, a translation memory (a fancy tool that keeps us consistent and efficient), glossaries, style sheets, and any extra references you provide to us.
  2. Translation: Our team of translators works together to translate the text into the language(s) you need (be it Chinese, Japanese, Korean, or English) and deliver the completed translation back to the project manager right on time!
  3. Editing & proofreading: The project manager will get the translations to our editors, who will compare the translations with the original text to make sure that all of your content is conveyed faithfully. Additionally, our proofreaders will conduct a second round of editing, this time to make sure that the translations read as good-quality, effective text in the target language.
  4. Re-assembling the files: Once everything is up to par, our engineers reverse the process they performed in the beginning, putting the new translations back into your full product.
  • Review & Delivery
  1. Quality assurance: Once the files are ready, we will make sure that the final product works, testing for bugs and triple-checking that the translations are effective in context. If you agree, we will invite third-party reviewers to review the final product. Of course, you will also give us your input!
  2. Delivery: Once everything is just the way you need it, we will deliver the freshly-localized product to you, ready to release to your international audience.
  3. Invoice: At this point, you will confirm receipt and we’ll send an invoice to you, detailing the rates and costs indicated in the contract, along with any changes that were made along the way.
  4. Feedback & evaluation: To continue improving and refining our localization services, we’ll request feedback from you, and debrief among ourselves to determine what was successful and what created problems. Hopefully, you enjoyed your experience with us enough that we have the pleasure of getting to do it all over again with you in the future!

We designed our file system to manage a large number of files. Click here to see what it looks like: File System

To ensure the unrivaled quality of our service, we also worked out a vendor recruiting process, which includes three parts: Resource Evaluation, Language Evaluation, and Onboarding and Welcoming (Click here): Recruiting Process

We used Basecamp to track our project and ensure good communication with one another. We used ProjectLibre to schedule our workflow as follows (Click here): Schedule

For a for-profit entity, it is always vital to engage and convince clients so that they buy your service. Here is a case pitch that shows how we presented ourselves to a client (Click here): GalaXync Pitch

 

The whole process was an engaging journey that let me walk through and understand, from a holistic viewpoint, how the language industry works as a “professional”. More importantly, I learned how to be part of a team: how to communicate effectively with other team members, learn from others, and collaborate with them. In the business world, many things need to be written down in a professional manner. I’m eager to dig beneath the surface of project management and explore more.

 

Note: All mentioned documents are designed and written by our whole team: Helen Jung, Chris Healy, Muge Zhang, Jiyi Zeng, Tianlin Li and Isabella Sun

 

TMS Comparison: Lingotek, XTM Cloud and GlobalLink

 

In the TMS market, Lingotek, XTM Cloud, and GlobalLink stand out from the crowd and are widely adopted by clients and LSPs. They share similar core functions, and each differentiates itself through its own unique features.

 

Lingotek

Compared with the old-fashioned interface of WorldServer, Lingotek looks much more modern and user-friendly; its usability is apparent from its appearance alone. The developers of Lingotek have seized the chance to integrate new functions into the system to suit the emerging needs of a fast-growing industry.

  • Integration with CMS and Crowdsourcing: Lingotek integrates with many popular CMSs such as WordPress, Drupal, and Joomla. It also integrates crowdsourcing solutions, so project managers can crowdsource or outsource projects to a community, reducing time and cost.
  • File Preparation: In the file preparation phase, Lingotek separates the content from websites or other products, removes all of the non-translatable parts, and breaks the text into segments. Conventionally, this task is assigned to an engineer, but here it is fully automated by the system. When the translation is done, the translated text can easily be returned to the original product with images, style, and formatting intact.
  • Collaboration: On the Lingotek workbench, linguists can work on the same document at the same time and communicate with one another through real-time chat and segment-specific notes. This is helpful when more resources are needed because of a large volume or an urgent deadline.
  • From Content Creation to Translation: In Lingotek, users can manage not only translation assets but also the original content. Any changes made to the content are sent as notifications to the translators so that they can follow up and edit the corresponding translation. When a project is completed, users can publish it in real time on the platform.
  • Actionable Analytics: If users don’t know which languages to translate into in order to best expand their market, Lingotek can provide statistics such as web page views to help them make a better decision.
  • Content Value Index: Lingotek developed this measurement to tailor different workflows to different content. Automatic: machine translation (MT) + translation memories (TM). Community: a community of volunteers translates the content. Professional: professional translators translate the content.

User’s Notes – 

  1. In the project settings of Lingotek TMS, there is a button called “new” next to the workflow and client name fields. When project managers cannot find the correct workflow or client, this button lets them create a new one instead of abandoning the project creation and starting again.
  2. Unlike in WorldServer, every account can have multiple roles in Lingotek TMS. As a project manager, I can assign a translation to myself. This feature has its pros and cons. A person who wants to switch roles (for example, from translator to reviewer) does not need to log out and log in again or hold several accounts. However, it can also cause confusion when a PM is assigning tasks, because accounts are not tied to a single role.
  3. In Lingotek TMS, there is a leaderboard that shows the top translators and top teams. PMs can assign tasks to the most capable people, and linguists are encouraged to provide better translations. In WorldServer, roles and teams are arranged in alphabetical order, and there is no similar function.
  4. In Lingotek TMS, there are two types of resources for reviewing: a voting pool and moderators. Review tasks can be assigned to a community instead of a person or a team.

 

XTM Cloud

As another leading TMS on the market, XTM Cloud offers the following functions.

 

  • Filter customization: Like other popular TMSs, XTM has many filters above the project list, which help users find the target project efficiently. However, the “Save as Project Filter” function makes XTM stand out from the crowd. It allows users to customize new project filters based on their needs and save them for later use.
  • Instant Messenger: Communication is a vital part of successful project management, especially translation management. XTM takes this into consideration and integrates an instant messenger into the system, so that project managers can chat with linguists and clients about ongoing work and avoid any ambiguity.
  • LQA: XTM integrates the TAUS Dynamic Quality Framework as its quality assurance metric. The QA reports from the reviewers are used to rate translators.
  • Context for segments: In the notes for a segment, images or a link to API context can be uploaded as a reference for translation. Linguists can thus get enough background knowledge from the original content and avoid translating without context.
  • Mobile Application: XTM has a mobile application that lets users track projects on their phones, which is a thoughtful touch.

User’s notes –

  1. When choosing the language combination for a project, users must first change the languages of the client and then add those languages to the linguists. This can cause some confusion.
  2. When changing the status of translated segments, there is no “confirm all” feature, so users must confirm segments one by one.
  3. XTM has so many features that it can sometimes overwhelm and confuse users.

 

GlobalLink

GlobalLink is another powerful TMS adopted by the leading LSP, TransPerfect. Its distinctive features are as follows:

  • Client Customization: It has several modules, including Order Portal, Term Manager, TM Server, Review Portal, Trans studio, Machine Translation, and Report Portal. Clients can decide which modules to combine and use for their products.
  • Integration with connectors: It has over 40 connectors, including eCommerce, database, CMS, and other products.
  • GlobalLink OneLink and Transport: GlobalLink OneLink is a branch of its workflow that handles proxy localization, and GlobalLink Transport is another branch for simplified projects.
  • Security: Beyond the account-and-password requirement, it has an XM decrypt function to ease the security concerns that come with cloud-based tools.

User’s notes –

  1. PMs can manage notifications based on their needs, filtering out unnecessary information.
  2. UI Design: Different colors indicate different things (for example, in the progress bar).
  3. Linguists have access to instructions, reference files, highlighted terms, and a live review function that is compatible with Office files and, in the future, InDesign files.
  4. Reviewers can produce two types of reports: a segmentation report as feedback to linguists, and a graph report for clients. GlobalLink uses Tableau as its reporting tool, integrating the power of data into the system.
  5. The “Gate task type” setting can be used to set up “where to stop or jump through” (for example, do the linguists on this particular project need a review?).
(Guest speakers from GlobalLink Translation.com came to MIIS and introduced their state-of-the-art technology. The fourth person from the right in the front row is me.)

 

Scenario 

China’s largest e-commerce company, Alibaba, wants to bring in localization services and provide multilingual pages for its major online shopping website, Taobao, in order to expand its business globally. It has decided to adopt a TMS to help manage its website localization projects.

 

Target Analysis – 

Company Size: Large

Organization of company: Client & For-profit

Typical project types: website page (with images)

Typical workflow: Agile (web content is not fixed)

 

Solution – 

Lingotek.

  1. The purpose of web page translation is to let more online buyers learn about the products, so the quality goal for the content is “understandable”. In addition, the volume of tasks is very large given Alibaba’s business scale. Alibaba can use Lingotek to crowdsource its projects, saving time and cost while still reaching the “understandable” level of quality.
  2. Because the project type is web pages, Lingotek can integrate with the CMS directly, saving time on file transfer.
  3. Users can use its Actionable Analytics function to determine the target markets and maximize profit.
  4. The agile workflow requires a tool that lets users change content and the corresponding translation at the same time, which Lingotek allows.

 


Expanding Reach through Crowdsourced Translation

Introduction 

Translation crowdsourcing is a popular choice for many client-side companies and language service providers. It refers to a situation in which a crowd steps forward and collaboratively does the translation work, often without direct financial motivation. It has been successfully implemented by many business and technology giants, such as Facebook, Twitter, and Adobe.

Although translation crowdsourcing reduces the cost of hiring linguists and translators, it brings its own costs for advertising, management, and technical infrastructure. Why do many companies still choose it? Firstly, it is a brilliant form of brand promotion. Secondly, when the volume of translation is massive or the deadline is urgent, the power of the crowd can deliver high efficiency. However, the biggest headache for crowdsourced translation is inaccuracy. Most volunteers are not professional translators, and because the work is unpaid, they are under no obligation to provide high-quality work. Thus, maintaining quantity and quality control of crowdsourced translation is critical to its success.

A number of popular tools can be used for translation crowdsourcing. In terms of technology, there are Lingotek TMS, Ackuna, Amara, etc. In terms of terminology, there are Tricider, TermWiki, etc. Each tool has its own pros and cons. Decision makers can also customize their own tool based on the unique features of their projects.

How to maintain quantity control?

1. Gamification: Add game-like elements to the whole system. For example, design and set up a reward system for volunteer translators in which they win badges when they complete a certain amount of work.

2. Machine Translation Application: Apply machine translation to the translation process. When a project starts, the blanks can be automatically filled with machine translation results, so the translator only needs to revise them instead of translating from scratch.

3. User-Friendly Interface: Design a user-friendly interface. When potential translators see the interface for the first time, they can quickly understand each process, and easily be attracted to start volunteering.

4. Positive Affirmation: Give affirmation and recognition to volunteer translators frequently, such as reminders of their contributions, virtual gifts and so on.

5. Process Reminders: Design and set up process reminders such as milestones to constantly remind translators of their workload.

6. Mobile Apps: Design a translation app so that translators can work not only on computers but also on phones or tablets. Include an offline mode so that they can download a project and translate it offline. This can increase the potential working hours.

7. Subject Oriented: Classify the translation projects based on their subjects. Volunteer translators can choose the subjects they are most interested in before translation starts. The classification can reduce the potential subject barrier and maintain the passion of the translators.

8. Deadline Alarm: Set up a deadline function for the translators. They can set the deadline for a project once it starts. The system can then send deadline alarms automatically to remind them to finish their work on time.
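
As an illustration of item 2 above (machine translation pre-fill), here is a minimal Python sketch. It is only a sketch under stated assumptions: the `Segment` class, the status labels, and the `fake_mt` function are hypothetical names invented for this example, and `fake_mt` merely stands in for a call to a real MT engine.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Segment:
    source: str
    target: Optional[str] = None  # None means nobody has translated it yet
    status: str = "empty"         # hypothetical labels: "empty", "mt_draft", "human"


def prefill_with_mt(segments: List[Segment], translate: Callable[[str], str]) -> int:
    """Fill every empty segment with an MT draft and return how many were filled."""
    filled = 0
    for seg in segments:
        if seg.target is None:
            seg.target = translate(seg.source)
            seg.status = "mt_draft"  # flagged so a volunteer still revises it
            filled += 1
    return filled


def fake_mt(text: str) -> str:
    # Hypothetical stand-in for a real machine translation engine.
    return f"[MT] {text}"


segments = [
    Segment("欢迎光临"),
    Segment("加入购物车", target="Add to cart", status="human"),
]
count = prefill_with_mt(segments, fake_mt)
print(count, [(s.target, s.status) for s in segments])
```

Segments that a human has already translated are left untouched, so the volunteer only revises the drafts marked `mt_draft`.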

How to maintain quality control?

1. Linguistic Preparation: Before the translation project starts, prepare relevant glossaries as volunteer translators’ reference and a style guide for them to follow.

2. Vetting System: According to the requirements of different projects, build a vetting system to select qualified translators. The criteria include their certification, educational background, proof of language skills, related professional experience, etc.

3. QA Automation: Design and set up a quality assurance auto-check function, and apply it to the translation process. When translators work on a project, the typos, grammatical mistakes and other issues can be automatically identified and corrected.

4. Segmentation: Design and set up a segmentation system that divides the content of the project logically. The segmentation should obey two rules: 1. Segments should not be so short that translators lose the context. 2. Segments should not be so long that they demand excessive linguistic competence.

5. Shared TM System: Design a cloud-based translation memory that is open to all volunteer translators. For certain terms, the system can suggest recommended translations for the translators to choose from. This can boost efficiency and accuracy at the same time.

6. Voting System: Design and set up a voting system for peer review. Translators working in the same language pair can review each other’s translated segments and vote on them. The translation that receives the most votes is ranked above the others.

7. Hierarchy Establishment: Design several levels to differentiate volunteer translators according to the votes their translations receive. Translators who climb to a higher level can receive bonuses such as certifications, free memberships, etc.

8. Professional Review: Hire professional reviewers from LSPs, academic institutions, etc. to review the final projects. This step can be conducted after peer review.

9. Feedback: Design a feedback channel that is open to all volunteer translators. Anyone can provide suggestions to the community manager.
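
The voting system in item 6 can be sketched in a few lines of Python. This is an illustrative sketch, not the mechanism of any particular tool: the function name `pick_winners` and the vote format (one `(segment_id, candidate_translation)` pair per vote cast) are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple


def pick_winners(votes: List[Tuple[str, str]]) -> Dict[str, str]:
    """Return the most-voted candidate translation for each segment.

    Each vote is a (segment_id, candidate_translation) pair. On a tie,
    the candidate that received its first vote earliest wins.
    """
    tallies: Dict[str, Counter] = {}
    for segment_id, candidate in votes:
        tallies.setdefault(segment_id, Counter())[candidate] += 1
    return {seg: counter.most_common(1)[0][0] for seg, counter in tallies.items()}


votes = [
    ("s1", "Add to cart"),
    ("s1", "Put in cart"),
    ("s1", "Add to cart"),
    ("s2", "Checkout"),
]
print(pick_winners(votes))
```

The same tallies could also feed the hierarchy in item 7, for example by summing the votes each translator's submissions receive.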

Final Project | Translation Crowdsourcing for Taobao 

By Jiyi Zeng & Tianlin Li 

Why does Taobao need translation crowdsourcing?

We noticed that your website is providing service to other countries and regions, yet it is only available in Simplified Chinese and Traditional Chinese. The Google Translate plugin could be enabled while browsing the site, but it seems that it is not working so well, especially when it comes to the text embedded in pictures, which may constitute the majority of the information on some pages.

We understand that the reason you are reluctant to add other languages might have been that the target customers you have in mind are just Chinese-speaking people in other countries, and they may contribute to a large share of the sales figures. However, in recent years, we have seen your business booming around the world, along with the increasing frustration experienced by non-Chinese speakers who tried to use the site. If you are looking to further expand your business internationally and at the same time are also concerned about the potentially high cost of localizing your site into other languages, we have a solution for you.

We are introducing a hybrid model that combines machine translation, community translation, and professional review. We recommend translating different content on your website with different methods, according to the content’s business value. How do we determine the value of the content? We look at its visibility and the potential profit it brings. For example, the information on the product pages posted by sellers is highly perishable; therefore, we do not recommend employing professional translators for that job. The reason is twofold: both the time it takes to finish the translation and the high cost associated with it would make it infeasible.

How are we going to implement this model?

After thoroughly analyzing the website, we believe that there are two major components that vary in value. For each component, we propose a different integrated method.

UI: This includes the homepage, help page, settings page, shopping cart, etc. Undoubtedly, these have the highest visibility and potential for profit. In this case, we will invite volunteers to translate the content and have professional translators review, edit, and confirm the translation. In the workbench where the volunteers translate the website in context, translation memories, glossaries, and machine translation are also available for reference.

Product Pages: We are going to offer two options for sellers who want to translate their product pages into other languages. They can use the Google Translate plugin to translate and publish their pages. Alternatively, they can pay Taobao a fee to crowdsource the translation. The fee does not go to volunteers; it only covers the cost of management and of using Taobao’s crowdsourcing tool. Sellers can also choose, for an extra fee, to have professional translators review and confirm the translation. Sellers will post the work through the tool, and it will either be claimed by volunteers or assigned by the system. After the work is done, the system will notify the seller that the content is ready to be published. To avoid untranslated text embedded in pictures, for either option, sellers may only upload a text file to the system, and the final deliverable will be in the same format. Sellers will have to complete the DTP work (if any) themselves.
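
The routing described for the two components can be summarized as a small decision function. The Python sketch below only restates the proposal; the function name, parameter names, and step labels are hypothetical and chosen for illustration.

```python
from typing import List


def plan_workflow(component: str,
                  seller_pays_for_crowd: bool = False,
                  professional_review: bool = False) -> List[str]:
    """Return the ordered translation steps for a piece of content."""
    if component == "ui":
        # High-visibility content: volunteers translate, professionals confirm.
        return ["community_translation", "professional_review"]
    if component == "product_page":
        if not seller_pays_for_crowd:
            # Default option: the seller relies on the machine translation plugin.
            return ["machine_translation"]
        steps = ["community_translation"]
        if professional_review:
            # Optional add-on the seller pays extra for.
            steps.append("professional_review")
        return steps
    raise ValueError(f"unknown component: {component!r}")


print(plan_workflow("ui"))
print(plan_workflow("product_page"))
print(plan_workflow("product_page", seller_pays_for_crowd=True, professional_review=True))
```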

Since Taobao currently offers its international service in Korea, Malaysia, Australia, Singapore, New Zealand, Canada, the United States, and Japan, we propose that the target languages be English, Korean, Japanese, and Malay.

Resources | Tool

We plan to partner with Lingotek to build a customized crowdsourcing tool that integrates an in-context workbench and a TMS workbench that enables communication not only between volunteers and the localization team but also among different departments inside the localization team at Taobao.

Team and Roles

We will set up an independent localization team in your company. The localization project manager will deal with issues reported by regional managers and liaise with other departments. The regional manager will manage a local translator community. Linguistic and technical QA managers will maintain the translation quality through linguistic and technical assurance.

Financial Support

This localization team needs funds to cover the cost of management and daily maintenance. Once the project starts, Taobao may allocate some funds as the initial capital. Later, the fees paid by sellers who use crowdsourcing will contribute. Once the project successfully expands to the global market and boosts sales, the team can be funded from future profits.

Operational Process

1.     The localization team will start the first iteration of the localization project.

2.     The team will solicit feedback from volunteers, professional translators, managers, other departments and customers to find out potential problems and improve the localization process.

3.     After all the target languages are launched, the marketing department at Taobao will analyze changes in page views and sales volumes in different locales.

4.     The data analysis reports will help the leaders of the localization team with decision-making regarding the allocation of resources in the future.

Quantity and Quality Control Practices 

In this particular case, we worked out the best practices for quantity and quality control in the process of conducting this crowdsourced translation project for Taobao:

Quantity Control

Quality Control

Video Presentation

Taobao Pitch Video