The development of machine learning has changed many industries, including the language industry. From time to time we ask questions like “Will machine translation replace human translators?” But what machine learning brings to the language industry goes beyond the question of whether humans will be replaced. This essay examines the technologies that most affect the language industry, what is changing, and how industry stakeholders can better prepare themselves for the future.
Machine Learning Technology Impacting the Language Industry
The main technology that has already had a profound impact on the language industry is natural language processing (NLP). NLP is a vibrant interdisciplinary field whose goal is to get computers to perform useful tasks involving human language, such as enabling human-machine communication, improving human-human communication, or simply processing text or speech in useful ways. Examples of these tasks, such as conversational agents, grammar checkers, and machine translation (MT), have been around for a long time. Although MT has been under development for over 60 years, only recently have humans felt a real threat from it: machine learning has made great progress, and there is considerable evidence that the output of neural machine translation (NMT) is approaching human quality.
Besides NLP, machine learning can predict users’ preferences from large amounts of data, enabling customized user experiences and simpler, more automated workflows. All of this contributes to the transformation of the language field by bringing in new players, more streamlined production processes, broader user coverage, and smarter tools.
Changing Landscape: New Players, Streamlined Workflow, and Smarter Tools
Machine translation (MT), the most widely applied machine learning technology in the language field, has reached every corner of language services. Nowadays, not only are language service providers (LSPs) branding themselves as AI/MT solution providers, but tech companies are also leading the way in MT development. There are more MT vendors in the field: Google, Microsoft, Amazon, DeepL, Alibaba, Tencent, etc. MT engines are easier to train, and users can customize their own engines with tools like AutoML Translation from Google or Microsoft Translator Hub. MT aggregators like Inten.to and translate4eu are emerging rapidly. We have also witnessed the birth and growth of LSPs like Unbabel and Lilt, which have MT in their DNA.
In addition to the high-level changes mentioned above, rapidly progressing technology has also changed the workflow of translation and localization projects and the interaction between enterprises and language service providers (LSPs). Traditionally, the localization workflow on the client side goes like this: the localization team receives the resource files to be localized from the engineering team and delivers them to vendors through a translation management system (TMS). After the translations come back, the localization project manager checks the files for quality and then sends them back to the repository for testing or build. As machine translation engines improve, more and more products go through an MT + post-editing (PE) process rather than the traditional translation > editing > proofreading (TEP) process.
Apart from the changes in workflow, the scope of MT is also growing. Companies like eBay, Adobe, and Microsoft have long relied on MT to make their products more accessible globally. Take eBay as an example. In the past, raw MT was mainly applied to user-generated content (UGC), such as search queries, item titles and descriptions, product reviews and descriptions, or member-to-member communication (M2MT) tools. But now eBay is employing MT for all eBay-created content, such as help documentation and UI. Applying MT to all products has become the localization strategy for more and more companies. Combined with data analysis, MT lets companies localize their products in a smarter and more scalable manner. For example, they can launch web pages in many languages using machine translation and then have humans polish only the pages whose page views are growing significantly. In this way, companies can greatly expand their coverage of target users without much additional cost.
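The “polish the pages that are growing” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PageStats` record, the two-period page-view comparison, and the `min_growth` and `min_views` thresholds are all hypothetical and do not describe how eBay or any other company actually selects content for post-editing.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Hypothetical traffic record for one machine-translated page."""
    url: str
    locale: str
    views_prev: int  # page views in the previous period
    views_curr: int  # page views in the current period


def growth_rate(page: PageStats) -> float:
    """Relative growth in page views between the two periods."""
    if page.views_prev == 0:
        # No earlier traffic: any current traffic counts as unbounded growth.
        return float("inf") if page.views_curr > 0 else 0.0
    return (page.views_curr - page.views_prev) / page.views_prev


def pages_to_post_edit(pages, min_growth=0.5, min_views=1000):
    """Pick raw-MT pages worth human polishing, busiest first.

    Both thresholds are illustrative assumptions, not industry norms.
    """
    selected = [
        p for p in pages
        if p.views_curr >= min_views and growth_rate(p) >= min_growth
    ]
    return sorted(selected, key=lambda p: p.views_curr, reverse=True)


pages = [
    PageStats("/help/returns", "de-DE", views_prev=800, views_curr=2400),
    PageStats("/help/returns", "pt-BR", views_prev=900, views_curr=950),
    PageStats("/help/shipping", "ja-JP", views_prev=1500, views_curr=3000),
]

for p in pages_to_post_edit(pages):
    print(p.locale, p.url, f"{growth_rate(p):.0%}")
```

In this sketch the pt-BR page stays as raw MT because its traffic is nearly flat, while the de-DE and ja-JP pages are queued for a human pass. A real pipeline would likely weigh other signals as well, such as conversion rates or user feedback on translation quality.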
Machine learning technology is also changing the tools used in the language industry. On the one hand, CAT tools and TMS tools are becoming more intelligent and personalized. Role-based personalization can produce pre-defined UI layouts aligned with the needs and tasks of particular user groups. At the individual level, the experience can be tailored to each user by drawing on collected customer data. On the other hand, we can see a growing trend toward integrating AI engines with these tools. Currently, most CAT tools do not come with pre-installed connectors to cloud-based AI engines. But we have also witnessed the emergence of new connectors that act as middleware to a range of machine learning engines.
New Roles of Stakeholders: Machine Learning for Efficiency, Humanity for Creativity
Like other technologies that bring both innovation and some disruption, machine learning may squeeze out some narrow job titles while creating new roles like NLP algorithm specialist, MT solution architect, etc. Those of us who are already stakeholders in the industry need to adapt to its transformation.
For linguists (translators and interpreters included), parts of our job have already been taken over by machines thanks to progress in machine learning algorithms, text-to-speech (TTS) technology, automatic speech recognition (ASR) technology, etc. In some fields, such as literature and transcreation, human translation still plays an important role. As translators and interpreters, instead of worrying about being replaced by machines, we should embrace the technology and look for ways to benefit from it. It is also advisable to learn skills like MT post-editing, data analysis, and the training of cloud-based AI engines.
For PMs, whether we are on the client side or the LSP side, continuously progressing technology will alter our responsibilities, our workflow, and even the team we work with. For example, the workflow of an MT project differs from that of a traditional project, depending on conditions like MT resources, language resources, and client needs. Before a project is launched, scoping the work comes first. How many words need to be translated? MT projects usually come in much larger volumes, often millions of words, than traditional translation projects. Will DTP, MT training, or human editing be included? Which MT engine should we employ? What are the evaluation metrics for QA? Can we simply quote by word count when we are not sure about the quality of the MT’s initial output?
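One possible way to approach the quoting question is to post-edit a small pilot sample and measure how much the linguist had to change. The Python sketch below does this with a word-level edit distance. It is an illustration only: the function names, the whitespace tokenization, and the idea of using the average edit rate as a scoping input are assumptions made here, not an established pricing method.

```python
def word_edit_distance(source_words, target_words):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(target_words) + 1))
    for i, s in enumerate(source_words, start=1):
        curr = [i]
        for j, t in enumerate(target_words, start=1):
            cost = 0 if s == t else 1
            curr.append(min(prev[j] + 1,          # delete a word from the MT output
                            curr[j - 1] + 1,      # insert a word the post-editor added
                            prev[j - 1] + cost))  # keep or substitute a word
        prev = curr
    return prev[-1]


def post_edit_rate(mt_output, post_edited):
    """Edits per post-edited word; can exceed 1.0 for very poor MT output."""
    mt_words = mt_output.split()  # naive whitespace tokenization (assumption)
    pe_words = post_edited.split()
    if not pe_words:
        return 0.0 if not mt_words else 1.0
    return word_edit_distance(mt_words, pe_words) / len(pe_words)


def average_post_edit_rate(sample):
    """Mean edit rate over (mt_output, post_edited) pairs from a pilot sample."""
    rates = [post_edit_rate(mt, pe) for mt, pe in sample]
    return sum(rates) / len(rates) if rates else 0.0


# Hypothetical pilot sample: raw MT output paired with its post-edited version.
pilot_sample = [
    ("the cat sat on mat", "the cat sat on the mat"),
    ("please to click the button", "please click the button"),
]
print(f"average post-edit rate: {average_post_edit_rate(pilot_sample):.1%}")
```

A low average rate suggests the engine output needs only light editing, which may support a lower per-word rate. A high rate suggests the project is closer to full translation and should be scoped accordingly.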
Even the people we work with change on such projects. We may need a machine translation solution architect who has expertise in the MT project process and is familiar with MT tools, so that he or she can pick out (or train) the best engine for the specific project. We may even need to work with an NLP specialist or a computational linguist to develop our tools further. Engineers may want to invest more time in connecting AI resources with CAT tools. Post-editors should be experienced linguists who are familiar with the common error patterns of MT and know how to stick to the style guide.
Given these possible changes, the role of a PM may evolve in the direction of resource management. He or she must make sure that each person on the team can access the needed resources at the right time, and must combine those resources effectively to provide the right solution to clients. The PM should also have a relatively strong technical background, along with communication, organization, and coordination skills, since an effective project plan depends on the cooperation of so many parties.
Developers’ tasks are also changing. By combining AI technologies with local resources like translation memories (TM), terminology bases, and linguists, developers are able to build tools with automation plugins, MT pre- and post-processing features, customizable dashboards, etc. They must keep up with the state-of-the-art machine learning models being developed and know what is needed to apply and deploy them.
As managers, we need to think realistically about the tasks that will disappear over the next few years and start planning for the more meaningful, more valuable work that should replace them. Some easy but repetitive tasks are actually a subtle encouragement for people to make narrow and boring contributions on the job. Machines do not get frustrated or annoyed, and they certainly do not imagine, so they are more efficient at handling such tasks. But we, as human beings, feel pain and get frustrated. And it is when we are most annoyed and most curious that we are motivated to dig into a problem and create change. Our imagination is the birthplace of our new products, new services, and even new industries. So, why not bring more humanity to the language industry?
Machine learning has opened up more possibilities and will definitely affect the language industry in many ways. However, it is important to note that human effort still plays an important role in many scenarios, and that in many locales there is still a long way to go in educating people about language technology. Most importantly, humanity is what makes all of this possible.