Introduction
This semester, we explored different web development languages and platforms, along with ways of localizing websites. For our group final project, we decided to localize a JavaScript game using the 24 ways library and to come up with ways of automating the 24 ways workflow.
The game we chose is blackjack, a card game in which the player and the dealer compare the values of their hands; the hand closest to 21 without going over wins. The game is mainly written in ES5, but it uses ES6 template literals to display its strings. Template literals build strings very differently from the ES5 approach, and that difference made it possible to automate the string-wrapping step.
For the localization itself, we used the 24 ways library to make the game multilingual. We localized the game into Chinese and added a language button that switches between English and Chinese.
Within the 24 ways workflow, we managed to automate the preparation steps. In this blog post, I will explain in detail how we did it.
24 Ways Automation
String Preparation
24 ways uses “_()” to wrap strings that need to be translated. If a program is written in ES5, we have to break strings and variables apart manually before wrapping the translatable parts. This program, however, uses ES6 template literals to display strings, so we could take advantage of that to automate the wrapping step. The difference between ES5 and ES6 is shown in the following picture.
ES6 template literals use the “`” sign (backtick) to wrap strings and “${}” to mark variables. ES5 uses double or single quotation marks to wrap strings and has no symbol to mark a variable, so variables are joined to strings with “+”. With ES5 syntax, we cannot reliably wrap strings using regular expressions or the “find and replace” function of a text editor, because quotation marks are used in many other places, such as ids and attributes.
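As a minimal sketch of that difference, here is the same message built both ways. The variable names and the message are made up for illustration and are not taken from the game.

```javascript
// Hypothetical values, used only for illustration.
var playerName = "Alice";
var score = 21;

// ES5: quotation marks wrap each string fragment; variables are joined with "+".
var es5Message = "Player " + playerName + " has " + score + " points.";

// ES6: backticks wrap the whole string; ${} marks each variable.
var es6Message = `Player ${playerName} has ${score} points.`;

console.log(es5Message === es6Message); // true
```

Both lines produce the same text, but only the ES6 version marks the string and its variables with symbols that are not used anywhere else in the line.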
In this program, “`” and “${}” are used only for the displayed strings. We can therefore find and replace both “`” and “${}” with the 24 ways wrapping syntax without worrying about matching unrelated quotation marks. The final result is as follows.
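The same idea can be sketched in code. This is an illustration rather than the exact editor steps: the function name wrapTemplateLiterals is hypothetical, and the sketch assumes that each “${}” holds a simple variable and that the strings contain no nested backticks, double quotes, or backslashes.

```javascript
// Sketch only: converts ES6 template literals into ES5-style concatenation
// in which every text fragment is wrapped in _().
// Assumptions: no nested backticks, no braces inside ${}, simple variables
// inside ${}, and no double quotes or backslashes that would need escaping.
function wrapTemplateLiterals(source) {
  return source.replace(/`([^`]*)`/g, function (match, body) {
    // Splitting on a capturing group keeps the ${...} contents:
    // even indexes are text fragments, odd indexes are variables.
    var parts = body.split(/\$\{([^}]*)\}/);
    var pieces = [];
    parts.forEach(function (part, index) {
      if (index % 2 === 1) {
        pieces.push(part.trim());           // variable, left unwrapped
      } else if (part !== "") {
        pieces.push('_("' + part + '")');   // translatable text
      }
    });
    return pieces.length > 0 ? pieces.join(" + ") : '""';
  });
}

// Hypothetical line of game code.
var before = "message.textContent = `You have ${score} points.`;";
console.log(wrapTemplateLiterals(before));
// message.textContent = _("You have ") + score + _(" points.");
```

Because the pattern only looks for backticks, quotation marks used for ids and attributes elsewhere in the file are left alone.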
String Translation
After preparing the strings, we moved on to the string translation phase. We found that 24 ways uses exactly the same string-wrapping pattern (“_()”) as pygettext, a Python localization tool. We therefore used pygettext to extract the strings that need to be translated. You can find the pygettext script in the python folder. The only thing to watch out for is that the file containing the strings must be in the same folder as the pygettext script. Next, open a command prompt and type the command (see the picture below). pygettext will then generate a pot file, which can be converted into a po file using Poedit. Po files can be translated in most CAT tools.
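To make it clearer what the extraction step produces, the sketch below mimics it in JavaScript. It is not pygettext and does not replace it: the function name extractEntries is hypothetical, it only handles double-quoted “_()” calls, and it emits bare msgid/msgstr pairs without the header and source comments that a real pot file contains.

```javascript
// Illustration only; this is not pygettext.
// Assumption: every translatable string is a double-quoted _("...") call
// with no escaped quotes inside.
function extractEntries(source) {
  var pattern = /_\("([^"]+)"\)/g;
  var seen = new Set();
  var entries = [];
  var match;
  while ((match = pattern.exec(source)) !== null) {
    var text = match[1];
    if (!seen.has(text)) {      // list each source string only once
      seen.add(text);
      entries.push('msgid "' + text + '"\nmsgstr ""');
    }
  }
  return entries.join("\n\n");
}

// Hypothetical game code after the wrapping step.
var wrappedSource =
  'status.textContent = _("You win!");\n' +
  'dealButton.textContent = _("Deal");\n' +
  'resetButton.textContent = _("Deal");';

console.log(extractEntries(wrappedSource));
```

Each entry pairs a source string (msgid) with an empty target string (msgstr), which is the part a translator fills in later.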
A po file contains source texts and target texts; the target texts are empty at first. At this stage, we came up with a further automation. The 24 ways internationalization method requires a strings file for each language, all in the same format (see the picture below).
On the left is the po file; on the right is the 24 ways strings file. It occurred to us that we could use regular expressions and the “find and replace” function of a text editor to generate strings files from po files.
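As a rough sketch of how such a strings file is used, assume it is a plain object that maps each source string to its translation. The object name i18n, the sample strings, and the body of the “_()” helper are assumptions made for illustration, not code copied from the project.

```javascript
// Hypothetical Chinese strings file: source string on the left,
// translation on the right.
var i18n = {
  "Deal": "发牌",
  "You win!": "你赢了！"
};

// Lookup helper in the spirit of the _() wrapper: return the translation
// if there is one, otherwise fall back to the source string.
function _(s) {
  if (Object.prototype.hasOwnProperty.call(i18n, s) && i18n[s]) {
    return i18n[s];
  }
  return s;
}

console.log(_("Deal")); // 发牌
console.log(_("Hit"));  // Hit (no translation, so the English text is kept)
```

A po entry carries the same pair of strings, only labelled with “msgid” and “msgstr” on separate lines, which is why one format can be turned into the other with a few replacements.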
First, check the regular expression box to enable regular expressions in the “find and replace” function. Then we can remove the unnecessary parts with regular expressions or simply with string literals (“.” matches any single character except a line break; “^” matches the start of a line; “$” matches the end of a line; “*” matches zero or more of the preceding element; “\s” matches a whitespace character; “\n” matches a line break).
The picture above is an example of using a string literal to target the text we need. Every “msgid” was replaced by a tab.
To get the 24 ways strings file format, one needs to replace each line break together with the “msgstr” that follows it with a colon.
The last step is to replace the remaining line breaks with commas. The following picture shows the final result.
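The three find-and-replace steps above can also be chained in a short script. This is a sketch under stated assumptions rather than the exact editor procedure: the function name poToStrings is hypothetical, the po text is assumed to contain only single-line msgid/msgstr pairs (no header, no comments), and line endings are assumed to be “\n”. It also keeps a line break after each comma so that the result stays readable.

```javascript
// Sketch only. Assumptions: bare single-line msgid/msgstr pairs,
// entries separated by blank lines, Unix line endings.
function poToStrings(poText) {
  return poText
    .trim()
    .replace(/\n\s*\n/g, "\n")       // drop the blank lines between entries
    .replace(/^msgid\s*/gm, "\t")    // step 1: "msgid" -> tab
    .replace(/\nmsgstr\s*/g, ": ")   // step 2: line break + "msgstr" -> colon
    .replace(/\n/g, ",\n");          // step 3: remaining line breaks -> commas
}

// Hypothetical translated po entries.
var poSample = [
  'msgid "Deal"',
  'msgstr "发牌"',
  '',
  'msgid "You win!"',
  'msgstr "你赢了！"'
].join("\n");

console.log(poToStrings(poSample));
```

The output is a list of tab-indented “source: target” pairs separated by commas, ready to be pasted between the braces of a strings file.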