Pre-Structure Phrases for Internationalization

Published October 07, 2014 by Loki Ng, posted by onesky
Key-value pairs are a common format for storing phrases for translation. However, they are not enough in most cases, especially for games that involve many items and characters, which are common components in every genre. Developing a game in English with key-value pairs is easy enough, since an item needs only two keys: singular and plural. Things become complicated with European or Middle Eastern languages, and the solution is to use structural formats instead of key-value pairs.

Different forms for pluralization

Many languages have plural rules that differ from those of English. For example:
  • There are 6 plural forms in Arabic.
  • There are 4 plural forms in Russian.
  • There are no plural forms (or, only 1 single form for all items) in Chinese.
If the phrases are stored as simple key-value pairs, you'll need to have 6 entries for every item in order to make sure it works fine in all languages. Key-value examples:

English
```
APPLE_ZERO = "%d apples";
APPLE_ONE = "%d apple";
APPLE_TWO = "%d apples";
APPLE_FEW = "%d apples";
APPLE_MANY = "%d apples";
APPLE_OTHER = "%d apples";
```

Russian
```
APPLE_ZERO = "%d яблок";
APPLE_ONE = "%d яблоко";
APPLE_TWO = "%d яблока";
APPLE_FEW = "%d яблока";
APPLE_MANY = "%d яблок";
APPLE_OTHER = "%d яблока";
```

Chinese (Traditional)
```
APPLE_ZERO = "%d個蘋果";
APPLE_ONE = "%d個蘋果";
APPLE_TWO = "%d個蘋果";
APPLE_FEW = "%d個蘋果";
APPLE_MANY = "%d個蘋果";
APPLE_OTHER = "%d個蘋果";
```

You may check out the detailed plural rules of all languages at unicode.org.
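As a rough illustration of why a fixed set of keys is awkward, the sketch below picks a plural category per language and then looks up the phrase. It is a minimal sketch, not a complete implementation: the rules are simplified versions of the Unicode plural rules and only cover non-negative integers, and the names `plural_category`, `APPLE_PHRASES`, and `format_apples` are hypothetical. The translated strings are illustrative.

```python
def plural_category(lang: str, n: int) -> str:
    """Return a plural category name for a non-negative integer n.

    Simplified rules for three languages only (an assumption of this sketch).
    """
    if lang == "en":
        return "one" if n == 1 else "other"
    if lang == "ru":
        mod10, mod100 = n % 10, n % 100
        if mod10 == 1 and mod100 != 11:
            return "one"    # 1, 21, 31, ... but not 11
        if 2 <= mod10 <= 4 and not 12 <= mod100 <= 14:
            return "few"    # 2-4, 22-24, ... but not 12-14
        return "many"       # 0, 5-20, 25-30, ...
    if lang == "zh":
        return "other"      # a single form for every count
    raise ValueError(f"no plural rule for language: {lang}")


# Each language stores only the categories it actually uses.
APPLE_PHRASES = {
    "en": {"one": "%d apple", "other": "%d apples"},
    "ru": {"one": "%d яблоко", "few": "%d яблока", "many": "%d яблок"},
    "zh": {"other": "%d個蘋果"},
}


def format_apples(lang: str, n: int) -> str:
    """Format the apple phrase for the given language and count."""
    return APPLE_PHRASES[lang][plural_category(lang, n)] % n


for count in (1, 2, 5, 21):
    print(format_apples("ru", count))
```

With this shape, English needs two entries, Russian three, and Chinese one, instead of six identical or near-identical keys per language.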

Different forms for genders

There are similar problems with masculine, feminine, and neuter forms in most European languages. English-speaking developers can easily overlook this until the need for translation arises. In some European languages, every noun has a gender, for example masculine or feminine. This grammar rule applies not only to human beings but also to objects. There are no strict rules for determining the gender, and it varies between languages: in Spanish, "computer" (la computadora) is feminine, while in German, "computer" (der Computer) is masculine. Adjectives and verbs that accompany the noun must agree with its gender. For example, in Italian:

English:
```
GOOD_FEMININE = "%s is good."
GOOD_MASCULINE = "%s is good."
```

Italian:
```
GOOD_FEMININE = "%s è buona." (when %s is a female character)
GOOD_MASCULINE = "%s è buono." (when %s is a male character)
```

A game usually involves many items and characters with different genders. To deliver a high-quality translation, a well-defined structure that supports gender is a must, so that translators can fill in the corresponding translations. You may also need a set of rules that define the gender of the characters and items in the game for each language. It is best to have this structure in place from the very beginning, to avoid problems later when the project becomes really big.
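One possible shape for such a gender-aware structure is sketched below. Everything in it is an assumption made for illustration: the character names "Maria" and "Marco", the `NOUN_GENDER` and `GENDERED_PHRASES` tables, and the `render` helper are hypothetical and not part of any particular framework.

```python
# Hypothetical per-language gender table for characters and items.
# English has no entry because its phrases do not vary by gender.
NOUN_GENDER = {
    "it": {"Maria": "feminine", "Marco": "masculine"},
}

# One logical phrase key, with as many variants as the language needs.
GENDERED_PHRASES = {
    "en": {"GOOD": {"other": "%s is good."}},
    "it": {"GOOD": {"feminine": "%s è buona.", "masculine": "%s è buono."}},
}


def render(lang: str, key: str, noun: str) -> str:
    """Pick the variant that agrees with the noun's gender in this language."""
    variants = GENDERED_PHRASES[lang][key]
    # Nouns without a recorded gender fall back to the "other" variant.
    gender = NOUN_GENDER.get(lang, {}).get(noun, "other")
    if gender not in variants:
        raise KeyError(f"{key!r} has no {gender!r} variant for {lang!r}")
    return variants[gender] % noun


print(render("it", "GOOD", "Maria"))
print(render("it", "GOOD", "Marco"))
print(render("en", "GOOD", "Maria"))
```

The game code asks for a single key, `GOOD`, and the gender lookup stays in data that translators can fill in per language.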

Conclusion

Use Structural Format Instead of Key-value Pairs.

A more structural format, converted from gettext, pairs each source phrase with one Russian plural form:

  %d apple     %d яблоко
  %d apples    %d яблока
  %d apples    %d яблок

This is the rule that selects among them:

  nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);

Instead of hardcoding the variations in the keys, rich data structures like XML help. There is a well-defined format that standardizes the needs of localization: XLIFF (XML Localization Interchange File Format). You can learn more on the Wikipedia page for XLIFF.

There are many frameworks for game development, such as Unity, Cocos2dx, and Unreal Engine. Most of them support localization well, though possibly with different formats. You can search their documentation to learn more:
  1. Unity Localization Asset
  2. Unreal Engine Localization
  3. Cocos2dx Localization
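The gettext plural rule for Russian quoted in the conclusion can be read as ordinary code. The sketch below is a hand translation of that expression into Python, not output from any gettext tool; the names `ru_plural_index`, `APPLE_FORMS`, and `apples` are hypothetical.

```python
def ru_plural_index(n: int) -> int:
    """Return the index of the plural form to use (nplurals=3)."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ...
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1  # 2-4, 22-24, ...
    return 2      # everything else: 0, 5-20, 25-30, ...


# The three forms, in the order the rule indexes them.
APPLE_FORMS = ["%d яблоко", "%d яблока", "%d яблок"]


def apples(n: int) -> str:
    """Format a count of apples in Russian."""
    return APPLE_FORMS[ru_plural_index(n)] % n


for count in (1, 3, 5, 11, 21, 24):
    print(apples(count))
```

The rule lives next to the translations rather than in the key names, so each language can declare however many forms it needs.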
You can also check out our website for more information on localization. We offer a translation tool free of charge to small teams. You can manage structural files easily with the plural rules preset in the platform. We would love to have your feedback in order to further improve the tool.

Comments

Matias Goldberg

The article starts with an excellent display of why a "Key = Value" solution is not well fit for localization and the subtleties of different major languages.

But then vaguely explains the solution to the problem and fails to address more complex cases, which are very common (like those where multiple amounts, custom names and genders may be placed in the same sentence).

XLIFF may be standardized, but the signal-to-noise ratio is horrible, and no tools for translators were mentioned.

Then it proceeds to give links to paid localization solutions, still without explaining how the problem can be solved efficiently.

This is why I reviewed the article as incomplete.

October 07, 2014 04:43 AM
onesky

Thanks for the feedback. We will add more solution-related information to the article.

October 07, 2014 05:29 AM
Zaoshi Kaba

Putting the article itself aside, the Russian plurals for apples aren't correct:

0 яблок
1 яблоко
2 яблока
...
5 яблок
...
etc.

October 07, 2014 11:16 AM
onesky

Thanks for letting us know Zaoshi! I've corrected the mistakes. I have included the form for 10.0 as "other" in the example.

October 07, 2014 05:45 PM
Endurion

Interesting article, good to point out some of the differences. The latter half is a bit short IMHO. It seems there would be different approaches whether you need to translate full text or assemble text from several pieces.

This however:

nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);

is Daily-WTF-material, that's obfuscation :)

October 08, 2014 06:53 PM
aureliengateau

For what it's worth, I put together a simple tool to be able to use Gettext in my Android game, as I wanted proper plural support. You can find it here: https://github.com/agateau/linguaj. The nice thing about Gettext is that there are plenty of tools to use it, so it's reasonably easy to get someone up to speed. A cross-platform tool like Poedit, for example, is simple to use even for someone who is not a developer.

October 10, 2014 09:37 AM
tnovelli

The problem with Gettext is, keys are English phrases. If you so much as add a comma, you have to search-and-replace everywhere that phrase is used. Inconsistent and error-prone.

Alternative #1: Use a simple key-value translation table. Non-techie translators can understand that just as well as PO files, and they don't need special tools, which are all overpriced garbage anyway. Translators HATE HATE HATE having to use $1000 crapware when Notepad would do just fine.

Alternative #2: Use vagueness and symbolism to sidestep proper pluralization/gender altogether... because it adds approximately nothing to gameplay and takes time+money that could be spent making the game enjoyable. It tends to come out amateurish anyway. Usually the English isn't even right, and it's safe to say the Chinese translation is inverse Chinglish. It takes a big budget to do it convincingly.

Ummm... end rant :)

October 12, 2014 06:08 AM
aureliengateau

I don't understand what you mean with "inconsistent and error-prone": it's up to the translators to update their translations once you give them the English text. Ah, maybe you mean you want to be able to update the English text without invalidating the translations? A workaround for this is to use "developer english" and "real english", but that's a bit cumbersome.

Regarding alternative #1: key-value translation tables do not handle plurals, and many tools which work on PO files are free software, they may be garbage, but they certainly aren't overpriced :)

I agree with alternative #2: especially since it makes it possible to build games for kids who can't read yet. It's not always possible nevertheless, for example describing achievements without text sounds difficult.

October 14, 2014 09:10 PM
onesky

The Daily-WTF rule is extracted from PO as an example of defining the logic for different languages. You can check out the rules from the unicode site I mentioned in the article as well.

And yes, PO is another format flexible enough to handle pluralization. The English key problem could be solved by using some standard keys instead.

msgid ""
msgstr ""
"Project-Id-Version: VERSION\n"
"POT-Creation-Date: 2014-10-14 22:07+0000\n"
"PO-Revision-Date: 2014-10-14 22:07+0000\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE TEAM <EMAIL@ADDRESS>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
msgid "button.menu.about_us"
msgstr "About us"
October 14, 2014 10:14 PM
tnovelli

Ok, @Aurélien and @onesky, "developer english" / "standard keys" seems like the best practice if you're using PO/gettext/etc. Still, if I were writing code like renderText(__("button.menu.about_us")) then I could just as easily write renderText(TR.button.menu.about_us) using the programming language's regular facilities. No additional tools or file formats needed - a key consideration for solo developers and small teams.

October 18, 2014 04:52 PM
Selenas

Hi, guys. To any of you who is in need of a good collaborative translation tool that supports plurals, I warmly recommend the localization management platform https://poeditor.com/

It's reliable and the support team is very swift.

March 06, 2015 04:00 PM