Key-value pairs are a common format for storing translatable phrases. However, they are not enough in most cases, especially for games that involve many items and characters, which are common components in every genre. Developing a game in English with key-value pairs is easy enough, since an item needs only two keys: singular and plural. Things become complicated with European or Middle Eastern languages, and the solution is to use structural formats instead of key-value pairs.
Different forms for pluralization
Many languages have plural rules that differ from those of English. For example:
- There are 6 plural forms in Arabic.
- There are 4 plural forms in Russian.
- There are no plural forms (that is, a single form is used for every count) in Chinese.
If the phrases are stored as simple key-value pairs, you need 6 entries for every item so that the same set of keys works in all languages, even though most languages use only some of them.
Key-value examples:
English
```
APPLE_ZERO = "%d apples";
APPLE_ONE = "%d apple";
APPLE_TWO = "%d apples";
APPLE_FEW = "%d apples";
APPLE_MANY = "%d apples";
APPLE_OTHER = "%d apples";
```
Russian
```
APPLE_ZERO = "%d яблок";
APPLE_ONE = "%d яблоко";
APPLE_TWO = "%d яблока";
APPLE_FEW = "%d яблока";
APPLE_MANY = "%d яблок";
APPLE_OTHER = "%d яблока";
```
Chinese (Traditional)
```
APPLE_ZERO = "%d個蘋果";
APPLE_ONE = "%d個蘋果";
APPLE_TWO = "%d個蘋果";
APPLE_FEW = "%d個蘋果";
APPLE_MANY = "%d個蘋果";
APPLE_OTHER = "%d個蘋果";
```
You can check the detailed plural rules of all languages at unicode.org.
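As a rough illustration of what the game code has to do with such keys, here is a minimal Python sketch that picks the key suffix from the count. The function names `plural_category` and `format_count` and the `ENGLISH` table are hypothetical, only three languages are covered, and the rules handle whole numbers only (fractions, which fall into the "other" category in Russian, are ignored).

```python
# Minimal sketch, whole numbers only. All names here are hypothetical.

def plural_category(lang: str, n: int) -> str:
    """Return the plural category name for a non-negative whole number."""
    if lang == "en":
        return "one" if n == 1 else "other"
    if lang == "ru":
        mod10, mod100 = n % 10, n % 100
        if mod10 == 1 and mod100 != 11:
            return "one"    # 1, 21, 31, ...
        if 2 <= mod10 <= 4 and not 12 <= mod100 <= 14:
            return "few"    # 2-4, 22-24, ...
        return "many"       # 0, 5-20, 25-30, ...
    if lang == "zh":
        return "other"      # a single form for every count
    raise ValueError(f"no plural rule for language: {lang}")


# Key-value table in the style used above (English only, for brevity).
ENGLISH = {
    "APPLE_ONE": "%d apple",
    "APPLE_OTHER": "%d apples",
}


def format_count(table: dict, item: str, lang: str, n: int) -> str:
    """Build the key from the item name and plural category, then format."""
    key = f"{item}_{plural_category(lang, n).upper()}"
    return table[key] % n


print(format_count(ENGLISH, "APPLE", "en", 1))
print(format_count(ENGLISH, "APPLE", "en", 3))
```

Even in this small form, the rule logic lives in code while the phrases live in a flat table, and nothing in the table tells a translator which keys their language actually needs.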
Different forms for genders
Similar problems arise with masculine, feminine, and neuter forms in most European languages. English-speaking developers can easily overlook them until the need for translation comes up.
In some European languages, every noun has a gender, such as masculine or feminine. This grammar rule applies not only to human beings but also to objects.
There are no strict rules that determine the gender, and it varies between languages. In Spanish, "computer" (la computadora) is feminine. In German, "computer" (der Computer) is masculine.
Adjectives and verbs that refer to the noun must agree with its gender.
For example, in Italian,
English:
```
GOOD_FEMININE = "%s is good."
GOOD_MASCULINE = "%s is good."
```
Italian:
```
GOOD_FEMININE = "%s è buona." (when %s is a female character)
GOOD_MASCULINE = "%s è buono." (when %s is a male character)
```
A game usually involves many items and characters with different genders. To deliver a high-quality translation, a well-defined structure that supports gender is a must, so that translators can fill in the corresponding translations.
You may also need a set of rules that defines the gender of each character and item in the game for each language. It is best to have this structure in place from the very beginning, to avoid problems later when the project becomes really big.
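One possible shape for such a structure is sketched below in Python. The tables `PHRASES` and `GENDER`, the function `render`, and the character names are all hypothetical. The point is only that the gender is stored per language next to the name, and the phrase is selected by it.

```python
# Minimal sketch, assuming gender is stored per language for every name.
# All table names, function names and character names are hypothetical.

PHRASES = {
    "en": {"GOOD": {"feminine": "%s is good.", "masculine": "%s is good."}},
    "it": {"GOOD": {"feminine": "%s è buona.", "masculine": "%s è buono."}},
}

# Gender is kept per language because the same item can be feminine in one
# language and masculine in another.
GENDER = {
    "en": {"Maria": "feminine", "Marco": "masculine"},
    "it": {"Maria": "feminine", "Marco": "masculine"},
}


def render(lang: str, key: str, name: str) -> str:
    """Pick the phrase variant that matches the gender of the name."""
    gender = GENDER[lang][name]
    return PHRASES[lang][key][gender] % name


print(render("it", "GOOD", "Maria"))
print(render("it", "GOOD", "Marco"))
```

With this layout the calling code never chooses between `GOOD_FEMININE` and `GOOD_MASCULINE` itself. It passes the name, and the data decides.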
Conclusion
Use Structural Format Instead of Key-value Pairs.
A more structural format converted from gettext for Russian, shown as source and target pairs:
- %d apple → %d яблоко
- %d apples → %d яблока
- %d apples → %d яблок
This is the rule for it:
nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);
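To make the rule easier to read, here is the same expression written out as a small Python function. This is only a sketch: the name `russian_plural_index` is hypothetical, and the three Russian forms of "apple" in `FORMS` (яблоко, яблока, яблок) are supplied here for illustration.

```python
# Minimal sketch of the rule above. Names are hypothetical.

def russian_plural_index(n: int) -> int:
    """Return the form index (0, 1 or 2) selected by the rule above."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... but not 11
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1  # 2-4, 22-24, ... but not 12-14
    return 2      # everything else: 0, 5-20, 25-30, ...


# One translation per index, in the order the rule defines (nplurals=3).
FORMS = ["%d яблоко", "%d яблока", "%d яблок"]


def format_apples(n: int) -> str:
    """Format a count with the form the rule selects."""
    return FORMS[russian_plural_index(n)] % n


for count in (1, 3, 5, 11, 21):
    print(format_apples(count))
```

The translator supplies only the three forms. The rule itself is declared once per language rather than encoded in key names.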
Instead of hardcoding the variations in the keys, use a rich data structure such as XML. There is a well-defined format that standardizes the needs of localization: XLIFF (XML Localization Interchange File Format).
You can learn more on the Wikipedia page for XLIFF.
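For a feel of what this looks like in practice, the Python sketch below parses a small XLIFF 1.2 fragment with the standard library. The fragment is made up for illustration, and so are the file name `items.po`, the ids, and the function `load_plural_groups`. Grouping plural forms under `restype="x-gettext-plurals"` is a convention that some gettext converters use and is treated here as an assumption, not something every XLIFF tool will produce.

```python
import xml.etree.ElementTree as ET

# Hypothetical XLIFF 1.2 fragment. The plural grouping convention
# (restype="x-gettext-plurals") is an assumption for this sketch.
XLIFF = """\
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="items.po" source-language="en" target-language="ru"
        datatype="plaintext">
    <body>
      <group id="apple" restype="x-gettext-plurals">
        <trans-unit id="apple[0]">
          <source>%d apple</source><target>%d яблоко</target>
        </trans-unit>
        <trans-unit id="apple[1]">
          <source>%d apples</source><target>%d яблока</target>
        </trans-unit>
        <trans-unit id="apple[2]">
          <source>%d apples</source><target>%d яблок</target>
        </trans-unit>
      </group>
    </body>
  </file>
</xliff>
"""

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def load_plural_groups(xliff_text: str) -> dict:
    """Map each plural group id to its target forms, in document order."""
    root = ET.fromstring(xliff_text)
    groups = {}
    for group in root.iterfind(".//x:group[@restype='x-gettext-plurals']", NS):
        groups[group.get("id")] = [
            unit.findtext("x:target", namespaces=NS)
            for unit in group.iterfind("x:trans-unit", NS)
        ]
    return groups


print(load_plural_groups(XLIFF))
```

The resulting list of forms can then be indexed by whatever the plural rule for the target language returns.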
There are many frameworks for game development, such as Unity, Cocos2dx, and Unreal Engine. Most of them support localization well, though possibly with different formats. Search their documentation to learn more.
- Unity Localization Asset
- Unreal Engine Localization
- Cocos2dx Localization
You can also check out our website for more information on localization.
We offer a translation tool free of charge to small teams. You can manage structural files easily with the plural rules preset in the platform. We would love to have your feedback in order to further improve the tool.
The article starts with an excellent demonstration of why a "Key = Value" solution is not a good fit for localization, and of the subtleties of different major languages.
But it then explains the solution only vaguely and fails to address more complex cases, which are very common (such as sentences that combine multiple amounts, custom names and genders).
XLIFF may be standardized, but its signal-to-noise ratio is horrible, and no tools for translators were mentioned.
It then proceeds to give links to paid localization solutions, still without explaining how the problem can be solved efficiently.
This is why I reviewed the article as incomplete.