Localization Practicum – Automate 24 ways i18n method with Python

Some may say that you can’t have programming, localization and linguistic skills all at once; picking up all three seems almost impossible. Zach, Will and I decided to put that theory to the test. We wanted to apply what we know about Python to automate some of the processes we learned for localizing websites and apps.

Before automating anything, we first need to remind ourselves what the 24 ways i18n method is about. We’ll give you a brief intro here; if you’re interested in learning more, check out the full tutorial on 24 ways.

About 24 ways i18n method

So, about 24 ways i18n: it’s a way to insert translations into your website more dynamically. You might wonder: why do we need this so-called 24 ways method to insert our translations? Isn’t it as simple as finding the strings and replacing them with our translations? Unfortunately, that’s not the case.

Remember, different languages arrange the same information differently. For example, let’s say we’re managing a recipe website and there’s a recipe for a simple sandwich:

2 slices of toast
1 slice of cheese
10 slices of honey ham
Some butter
Some ketchup

In traditional Chinese, the recipe would be translated into:

吐司 2 片
起司 1 片
蜂蜜火腿 10 片
奶油少許
番茄醬少許

Notice something? Yes, the format is a bit different: in English the quantity comes before the ingredient, while in Chinese it comes after. The structures of the two languages aren’t exactly the same. That’s why, instead of blindly dumping translations into our website, we need to make sure we’re dynamically placing each translation according to the actual language structure. And that’s where 24 ways i18n shines. By using 24 ways, you’re not only making LSPs’ lives easier but also preparing your business to go global.

What do we actually achieve with Python?

Other than the main function, you’ll need:

String.js, which stores the translations (prototype ready)
Re-formatting/replacing the strings with the _() gettext method, e.g. “RandomText” –> _(“RandomText”) (still in progress)

For the semester, Zach, Will and I decided to start things off by automating the string extraction process, specifically pulling strings from HTML/XML files. Here’s the prototype file we came up with; feel free to try it out along the way. In the file, you should find the “scratch.py” Python file we wrote and the “Sample.html” file we toyed around with.

Extract strings and store them in “String.js” file with BeautifulSoup package

BeautifulSoup is a popular Python package for processing HTML and XML. With BeautifulSoup, you can have Python read HTML files, filter out tags as needed, or find strings within a specific type of tag.
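As a rough illustration of those three abilities, here’s a minimal sketch, assuming the bs4 package is installed. The sample markup below is made up for this example and is not our actual “Sample.html”, and this is not the code from “scratch.py”.

```python
from bs4 import BeautifulSoup

# Made-up sample markup (an assumption, not the project's Sample.html).
SAMPLE_HTML = """
<html>
  <head><title>Sandwich recipe</title><script>var x = 1;</script></head>
  <body>
    <h1>Simple sandwich</h1>
    <ul><li>2 slices of toast</li><li>Some butter</li></ul>
  </body>
</html>
"""

# Read the HTML with Python's built-in parser.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Filter out tags whose content is not user-facing text.
for tag in soup(["script", "style"]):
    tag.decompose()

# Find the strings inside one specific type of tag.
list_items = [li.get_text(strip=True) for li in soup.find_all("li")]

# Collect every remaining piece of text, trimmed of surrounding whitespace.
all_strings = list(soup.stripped_strings)

print(list_items)
print(all_strings)
```

Dropping script and style tags first matters for localization: otherwise JavaScript source would end up in the list of strings handed to translators.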

Here’s a quick sneak peek of what our program looks like:

And here’s what the sample HTML looks like:

With the power of BeautifulSoup (specifically, the “get_text()” and “prettify()” methods), we successfully extracted the strings and stored them in our target, the “String.js” file. On top of that, we also give our users a “demofile1.txt” file to help them see what the file looks like without tags:

String.js
demofile1.txt
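For readers who can’t open the files, here’s a hedged sketch of how a list of extracted strings could be turned into the contents of a “String.js” file. The layout is an assumption on our part: a single JavaScript object, named i18n here, that maps each source string to an empty translation for the translator to fill in. The build_strings_js helper is hypothetical and may differ from what “scratch.py” writes.

```python
import json


def build_strings_js(strings, variable_name="i18n"):
    """Return JavaScript source declaring an object keyed by source string.

    The object layout and the default variable name are assumptions made
    for this sketch, not a confirmed description of the project's output.
    """
    # dict.fromkeys keeps first-seen order and drops duplicate strings.
    catalog = dict.fromkeys(strings, "")
    # ensure_ascii=False keeps non-ASCII text readable in the output.
    body = json.dumps(catalog, ensure_ascii=False, indent=2)
    return f"var {variable_name} = {body};\n"


extracted = ["2 slices of toast", "Some butter", "Some butter"]
print(build_strings_js(extracted))
```

Writing the returned text to disk is then a single open("String.js", "w", encoding="utf-8") call. Using json.dumps takes care of escaping quotes and backslashes inside the strings.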

Along the way, we also realized that BeautifulSoup opens up a significant number of possibilities. We’re merely scratching the surface here. Combined with packages like “urllib”, it lets you pull info straight from the Internet!
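As a small sketch of that idea, the helper below downloads a page with the standard library’s urllib.request and returns its text, which could then be handed to BeautifulSoup. The fetch_html name and the 10-second default timeout are our own assumptions; this is not part of our prototype.

```python
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download url and return its body decoded as text.

    Hypothetical helper for illustration; error handling is left out.
    """
    with urlopen(url, timeout=timeout) as response:
        # Use the charset declared by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (needs network access, so it is left commented out):
# html = fetch_html("https://example.com/")
```

The same function also accepts file:// URLs, which is handy for trying it on a local copy of a page before pointing it at a live site.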

Take a step further?

In our weekly discussions, we also focused on topics like running the program from bash scripts, and localization best practices such as duplicating source files in case something goes wrong. With the time and knowledge we have right now, we achieved some of these but not all. You can find what we achieved in our Python program; those parts are stored as comments since they don’t directly contribute to our main goals, 24 ways or the _() gettext method.
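The “duplicate the source file first” practice can be sketched in a few lines with the standard library. The backup_source name and the “.bak” suffix are hypothetical choices for this example, not necessarily what our program does.

```python
import shutil
from pathlib import Path


def backup_source(path, suffix=".bak"):
    """Copy path to a sibling file with suffix appended and return the copy.

    Refuses to overwrite an existing backup so the untouched original
    is never lost by running the tool twice.
    """
    source = Path(path)
    backup = source.with_name(source.name + suffix)
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")
    # copy2 copies the file contents and tries to preserve its metadata.
    shutil.copy2(source, backup)
    return backup
```

Calling backup_source("Sample.html") before any extraction or rewriting step would leave a “Sample.html.bak” next to the original.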

One thing worth mentioning is that we successfully incorporated some basic GUI (Graphical User Interface) features into our program. With the tkinter package, our users can run the program -> select the file via GUI -> get the same result! That sounds amazing, doesn’t it?

Here’s the updated Python program.

After incorporating tkinter, we have a GUI now!
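For a feel of how little code the file-selection step needs, here’s a minimal sketch using tkinter’s standard file dialog. The choose_html_file function and its picker parameter are hypothetical; the parameter exists only so the function can be exercised without opening a window.

```python
def choose_html_file(picker=None):
    """Ask the user for an HTML/XML file and return its path ('' if cancelled).

    picker is an optional zero-argument callable used instead of the GUI
    dialog (an assumption of this sketch, useful for scripted runs).
    """
    if picker is not None:
        return picker()

    # Imported lazily so the module still loads on machines without a display.
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window; only the dialog is shown
    try:
        return filedialog.askopenfilename(
            title="Select a file to extract strings from",
            filetypes=[("HTML/XML files", "*.html *.htm *.xml"),
                       ("All files", "*.*")],
        )
    finally:
        root.destroy()
```

The returned path can then go straight into the same extraction code used by the command-line version, which is why the GUI run gives the same result.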

What’s next?

It’s clear that we’re far from done, but as our graduation approaches, we have no choice but to wrap up what we have for now and perhaps pass on what we’ve achieved to our colleagues who might be interested in taking over our project next year.

Here is what we have achieved so far, though it’s all open for improvement:

Basic GUI (tkinter)
Strings extraction (BeautifulSoup)
“String.js” file prep
Duplicate files capability

And moving forward, some suggestions on what the goals could be for our future colleagues:

Re-format the strings with the _() gettext method using Regex (perhaps like this? _\("(.*)"\) )
Optimize the overall program, QA for potential bugs
Come up with an i18n guide to help streamline the string extraction process
(Some people build their websites like Frankenstein’s monster, which makes it almost impossible to pull the strings properly)
See if we can integrate everything we have and fully automate the 24 ways i18n process
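For whoever picks up the first goal, here’s one possible starting point for the regex step. It’s a sketch, not a finished solution. The wrap_with_gettext name is hypothetical, the pattern differs from the one suggested above because it has to find strings that are not wrapped yet, and it only handles double-quoted strings on a single line with no escaped quotes.

```python
import re

# Matches a double-quoted string and, optionally, a "_(" right before it,
# so strings that are already wrapped can be recognized and skipped.
STRING_LITERAL = re.compile(r'(_\()?"([^"\\\n]*)"')


def _wrap(match):
    """Leave already-wrapped strings alone; wrap everything else in _()."""
    if match.group(1):
        return match.group(0)
    return f'_("{match.group(2)}")'


def wrap_with_gettext(source):
    """Return source with bare double-quoted strings wrapped as _("...")."""
    return STRING_LITERAL.sub(_wrap, source)


print(wrap_with_gettext('alert("RandomText");'))
```

Running the function twice gives the same output as running it once, which matters if the tool is ever re-run on a file that has been partly converted. A real version would still need to skip strings that should not be translated, such as URLs or CSS class names.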

Overall, I had a blast working on this project. Plus, my colleagues, Zach and Will, are highly competent with Python. We were able to exchange ideas and feedback all the time. With ideas flying around, we came up with workarounds that we might never have thought of on our own, which eventually led to the fruitful results we have here.

Here we have a video presentation to help you visualize what we’ve achieved today:

Link to our video presentation
