A short introduction to Regex
Use of Regex (https://regex101.com/)
Regex (regular expressions) is a useful tool for translators to check the quality of their translations. It also helps translators learn quality-checking and fixing skills.
What is Regex? How are we going to use it? How can Regex be used for the QA (Quality Assurance) checker in Trados Studio? I will use four general examples, going from English to Simplified Chinese, to explain the general logic of Regex.
Let’s start!
Example one, Date




How is it supposed to work?
Although dates in both American English and Simplified Chinese can be separated by forward slashes, the order is different. American English formats dates as month/day/year, while Chinese starts with the year: year/month/day. Please be careful when translating from other languages into Chinese: the year needs to come first in both writing and everyday speech.
How to do it in Regex?
\d matches a digit, and () creates a group. Since there are three groups of numbers here, and both month and day have one or two digits, I used (\d{1,2}) twice to find any 1- or 2-digit number. (\d{4}) finds any 4-digit number.
In the Regex substitution field, I used $3/$1/$2, which puts the 3rd group (the year) first, followed by the 1st group (the month) and the 2nd group (the day).
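The same reordering can be tried outside regex101. Here is a minimal Python sketch, assuming the search pattern is (\d{1,2})/(\d{1,2})/(\d{4}), which is reconstructed from the description above. The function name is only illustrative. Note that Python writes group references as \1, \2, \3 instead of $1, $2, $3.

```python
import re

# Assumed pattern: month/day/year, each part in its own capture group.
DATE_PATTERN = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{4})")


def us_to_chinese_date(text: str) -> str:
    """Reorder month/day/year dates into year/month/day."""
    # r"\3/\1/\2" is Python's spelling of the $3/$1/$2 substitution.
    return DATE_PATTERN.sub(r"\3/\1/\2", text)


print(us_to_chinese_date("Due on 12/25/2021."))  # Due on 2021/12/25.
```

The pattern only checks the shape of the date, not whether the month or day is valid, so a reviewer should still read each hit.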
Example two, Money Symbol




How is it supposed to work?
Both American English and Simplified Chinese use a special symbol for money. In American English, the symbol is $ (dollars), but in Chinese we use ¥ or 圆. For example, in the pictures listed above, rather than $250.00, Chinese uses ¥250.00 in formal writing such as checks, and 250圆 in both speaking and writing. Please be careful when translating from other languages into Chinese: the money symbol is different.
How to do it in Regex?
(\$) matches the dollar sign (the backslash is needed because $ on its own is a special character in Regex), (\d+) matches one or more digits, and (\.) matches the decimal point.
In the Regex substitution field, I used $2圆, which puts the 2nd group (the digits) first, followed by 圆.
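A rough Python sketch of this substitution follows. It assumes the full search pattern is (\$)(\d+)(\.)(\d+), with a fourth group for the digits after the decimal point. That fourth group is a guess, since the description only names the first three. The function name is illustrative, and the code swaps the symbol only, without converting any amount.

```python
import re

# Assumed pattern: "$", whole digits, ".", decimal digits.
MONEY_PATTERN = re.compile(r"(\$)(\d+)(\.)(\d+)")


def dollar_sign_to_yuan(text: str) -> str:
    """Keep the whole-number digits and append 圆, as $2圆 does."""
    # \g<2> is Python's unambiguous way to write group 2.
    return MONEY_PATTERN.sub(r"\g<2>圆", text)


print(dollar_sign_to_yuan("$250.00"))  # 250圆
```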
Example three, Name




How is it supposed to work?
As a native Chinese speaker, I have studied English for more than ten years. One difference I have found between American English and Chinese is how names are read and written. In American English, we give the first name, then the last name. In Chinese, however, the last name comes first. Take my own name as an example: in China, people call me 高玮 (Gao, Wei), but in America, people call me Wei Gao. The writing is different too. In English, we put a comma between the last name and the first name if we write the last name first, but we do not need a comma if we start with the first name.
How to do it in Regex?
(\w+) matches one or more word characters (letters, digits, or underscore). (\W) matches a single non-word character, such as a space.
In the Regex substitution field, I used $3 $1, which puts the 3rd group first, followed by a space and the 1st group.
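As a small Python sketch of the name swap, assume the search pattern is (\w+)(\W)(\w+), so that the 2nd group is the single space in "Wei Gao". The pattern is inferred from the group numbers above, and the function name is illustrative.

```python
import re

# Assumed pattern: word, one non-word character, word.
NAME_PATTERN = re.compile(r"(\w+)(\W)(\w+)")


def last_name_first(name: str) -> str:
    """Swap the two words, like the $3 $1 substitution."""
    return NAME_PATTERN.sub(r"\3 \1", name)


print(last_name_first("Wei Gao"))  # Gao Wei
```

Because (\W) matches exactly one character, this sketch handles names separated by a single space. A form such as "Gao, Wei" (comma plus space) would need a different pattern.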
Example four, Specific Nouns




How is it supposed to work?
Some special words in English, like father-in-law or mother-in-law, are written with hyphens between the words. Chinese does not need hyphens here: we say 岳父 or 公公 (father-in-law) and 岳母 or 婆婆 (mother-in-law).
How to do it in Regex?
(\w+) matches one or more word characters, and (\-) matches the hyphen "-".
In the Regex substitution field, I used $1 $3 $5, which keeps the three words and drops the hyphens between them.
Those are my four examples. Feel free to leave any comments and suggestions.
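Here is a minimal Python sketch of this last substitution, assuming the search pattern is (\w+)(\-)(\w+)(\-)(\w+): three words in groups 1, 3 and 5, with the hyphens in groups 2 and 4. The pattern is inferred from the group numbers, and the function name is illustrative.

```python
import re

# Assumed pattern: word, hyphen, word, hyphen, word.
HYPHENATED_PATTERN = re.compile(r"(\w+)(\-)(\w+)(\-)(\w+)")


def drop_hyphens(text: str) -> str:
    """Keep groups 1, 3 and 5 (the words) and leave out the hyphens."""
    return HYPHENATED_PATTERN.sub(r"\1 \3 \5", text)


print(drop_hyphens("father-in-law"))  # father in law
```

This only removes the hyphens from the English term so that it is easier to spot. The Chinese translation still has to be chosen by the translator.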