The biggest mistakes made by game developers (from a localization perspective)
Hi all,
I'm a full-time game translator who has been working 11 years in the localization industry.
Tonight I wrote a new article, about the biggest mistakes made by game developers during the localization process (seen from my own subjective perspective, of course).
Hopefully it will warn you of possible pitfalls and result in a better localization of your product. Additions, suggestions and feedback are of course welcome!
I will keep following this thread and answer any questions you may have.
Click here for the article
[Edited by - Yomar on August 2, 2006 3:06:24 AM]
Don't localize, Loekalize!
hmm, that's not a new article, is it? I'm sure I've read it before.
Anyway, just a bit of nitpicking:
Quote:
Fact is that the average translator can only translate 2000 words per day.
First, is this a "fact"? Any sources to back it up?
And second, that is, in your own words, an average translator. An above-average translator might be able to work faster, and he could then charge a lower rate per word, which sorta ruins your argument a bit.
Might want to work a bit on that, if you want it to look objective, as "advice to game developers about localization", rather than "Hire me, cos my rates are high, and so by my own logic, I must be worth it" [wink]
Quote: 2. Expecting a 100.000 word manual to be translated within 3 days
So the advice is what, exactly?
What if I only have 3 days until the localized manual has to be ready?
Quote:
Brian %s1% the gun
takes
drops
There is however another good reason to do the above.
First, less duplication reduces the risk of errors.
You only have to spell Brian or gun once, so the risk that Brian becomes "Brain" in one of the sentences is smaller. This might happen either in the original text written by the developers, or in one of the translations.
Second, you just went on about how slow a process localization is, so cutting down on the amount of material to translate obviously has its attractions. [wink]
That said, you do make a good point. [grin]
Oh, and "consult your translators in advance" probably isn't an option, given that when you develop the game, you might not know who your translators are yet, or even which languages the game will have to be translated to.
Quote:
Be realistic and allow at least 50% more space for your translations
Except in both the examples you gave, 50% wouldn't be enough anyway.
Quote:
11. Handling punctuation yourself
Uh... 11, eh? I thought this was a top 10? [lol]
Anyway, I'm just being picky. [smile]
I didn't see my pet peeve - "embedding the text directly into the code, rather than creating separate text file(s)." That's the one I've encountered most often (even just last year, with a Polish developer, after I told them specifically not to do that).
-- Tom Sloper -- sloperama.com
Spoonbender:
Lovely, I always like good feedback, especially when critical! I'm not here to be praised into heaven. I'm here to improve my website, and only critical feedback will help with that.
I'm sure you have encountered some of the tips given before ("using the El Cheapo agency" tends to come back once every so often), but hopefully not every single one of them.
a) Why is 2000 words the average translation speed?
Good one. It doesn't need to be true just because I say so. This link should be a bit more convincing.
Of course translators who work faster can charge lower rates, but at least this shows that as soon as speed rises above 2000 words per day, you should be vigilant and start asking questions.
b) What if I only have 3 days until the localized manual has to be ready?
Another good point - you want a solution for your problem. First, you can prevent situations like these by planning things in advance, or by feeding translators pieces of text as soon as they are ready. Even if some of these pieces change later on, CAT tools should be able to tell translators exactly what changed where, so that you'll never pay twice for the very same sentence.
c) Second, you just went on about how slow a process localization is, so cutting down on the amount of material to translate obviously has its attractions.
A bit far-fetched, as the number of words saved using this method is mostly less than a hundred (hard to prove, except based on experience). On top of that, the translator needs to slow down when translating these kinds of segments anyway, trying to think of a solution that works in his or her language. But I will address it :)
d) Oh, and "consult your translators in advance" probably isn't an option, given that when you develop the game, you might not know who your translators are yet, or even which languages the game will have to be translated to.
True, though developers who regularly localize games will already have built up a team of translators they can consult in situations like these (no one is going to charge for such simple advice). It can never hurt to know whether certain strings might cause problems in language A, even if it's not sure yet whether the game will indeed be localized into that language. Preventing problems is better than trying to solve them :)
e) Except in both the examples you gave, 50% wouldn't be enough anyway.
True, I'll add something in the vein of: "and even that is often not enough."
f) Uh... 11, eh? I thought this was a top 10?
I've already changed the title as I got some more ideas :)
===
tsloper:
"embedding the text directly into the code, rather than creating separate text file(s)."
Fantastic, that one definitely needs to be added! I loved your "Stupid Wannabe Tricks" by the way :D
I'm going to work on the above points right away. More feedback is of course welcome!
[Edited by - Yomar on August 2, 2006 2:44:47 AM]
Don't localize, Loekalize!
I believe the main problem here is that no one understands what 2000 words are. They all believe this scenario: you have a document that needs to be translated, and then you go into Word and ask for the word count. That is wrong! What the author of this post meant was 2000 different words. Like this:
I saw a big brown bear. I had never seen a brown bear before.
"Brown bear" is repeated and is not charged twice. That's the clue. Your document might have twice as many words, but only unique words are charged.
Hi Samurai Jack,
Highly theoretical story to follow - skip this part if you're not interested in the inner workings of CAT tools
Oh, if language only worked that way! No, unfortunately that is not how it works. 2000 words are indeed just that: what Word gives you when you press Extra > Word Count.
The discounts made possible by CAT tools apply to sentences that are exactly the same. I.e.
This is a sentence.
This is a sentence.
This is another sentence.
There are many ways to charge people, but my way works like this:
First sentence: 4 words x normal rate
Second sentence: 4 words (repetition of previous sentence) x 25% of normal rate
Third sentence: 4 words x normal rate
The reason for not giving a discount on the third sentence is that changing one simple word in English can have enormous consequences for the target language, sometimes even resulting in a total rewrite. The example already given in the article shows this:
Brian takes the gun - Brian pakt het pistool
Brian drops the gun - Brian laat het pistool vallen
The reason why I still charge 25% of the normal rate for exactly the same sentence is that you still need to check whether certain lines fit in different contexts. A heading called "Space", the first time referring to your space bar (Dutch: spatie) and the second time referring to the name of the first level in your game, which happens to be set in space (Dutch: de ruimte), is one example of that. If a client does not want me to check repetitions, he gets them for free, but the results are often disastrous and I can only advise against this.
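To make the rate scheme above concrete, here is a minimal sketch in Python. It assumes exactly what was described: an exact repetition of an earlier sentence is charged at 25% of the normal rate, and everything else at the full rate. The function name `weighted_word_count` and the whitespace-based word count are illustrative assumptions, not how any particular CAT tool analyzes a text.

```python
def weighted_word_count(segments, repetition_factor=0.25):
    """Return (full_rate_words, repeated_words, billable_words).

    A segment counts as a repetition only if it is identical to an
    earlier segment; similar ("fuzzy") segments are charged in full.
    Words are counted by splitting on whitespace (an assumption).
    """
    seen = set()
    full_rate_words = 0
    repeated_words = 0
    for segment in segments:
        words = len(segment.split())
        if segment in seen:
            repeated_words += words
        else:
            seen.add(segment)
            full_rate_words += words
    billable_words = full_rate_words + repeated_words * repetition_factor
    return full_rate_words, repeated_words, billable_words


segments = [
    "This is a sentence.",
    "This is a sentence.",
    "This is another sentence.",
]
full, repeated, billable = weighted_word_count(segments)
print(full, repeated, billable)  # 8 4 9.0
```

For the three example sentences this gives 8 words at the full rate plus 4 repeated words at 25%, so 9 billable words instead of 12.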
There are people out there who also give a discount on "similar" sentences (in which case the third sentence above would be considered as a so-called 75% Fuzzy Match, resulting in a 25% discount), but this is a practice I strongly disagree with. Changing 25% of the source text does not automatically lead to a 25% change in the target text. In the example with Brian's gun, a 25% change in the source text results in one new word and one extra word in the target text, which is far more than 25%.
But CAT tools are really complicated stuff, and even the above examples have been greatly simplified. Lately agencies even demand certifications to prove that you're actually able to work with this kind of software.
The unique words you mentioned only come into play in terminology management, which is really a different beast. You want translators to translate the word bear as bear consistently, not sometimes as black bear and the next time as grizzly. That's an aspect, however, that has nothing to do with rates - it's merely a technical problem that is completely dealt with on the translator's side.
Boring part finished - read on
This whole story is highly theoretical and something average developers should not worry too much about. What normally happens is that developers give me a text, I analyze it and come up with a price. They then agree or not. What you should realize however is that if the same sentences or even paragraphs keep popping up in your text again and again, you should get a discount... and that there is software out there which can see which sentences repeat, which sentences are new and which sentences are not new, so that you don't need to count all repetitions and new sentences manually (which is what developers still do in the vast majority of the cases - absorbing huge amounts of time).
The basic building blocks of repetitions (sentences, pairs or whatever you call them) are, however, whole segments of text, separated by . ! ? : ; (not ,), hard enters or code. A segment is the smallest building block of a translation. Go lower and you enter the world of Machine Translation, which is still in its infancy and delivers very unreliable results (if you have ever used Babelfish you know why). The word "sentence" comes closest, I think. The official term for these building blocks is "pairs", but that's really translator's jargon which means nothing to developers:
Pair 1: This whole story is highly theoretical and something average developers should not worry too much about.
Pair 2: What normally happens is that developers give me a text, I analyze it and come up with a price.
Pair 3: They then agree or not.
Pair 4: What you should realize however is that if the same sentences or even paragraphs keep popping up in your text again and again, you should get a discount...
Pair 5: and that there is software out there which can see which sentences repeat, which sentences are new and which sentences are not new, so that you don't need to count all repetitions and new sentences manually (which is what developers still do in the vast majority of the cases - absorbing huge amounts of time).
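The segmentation rule described above can be sketched roughly like this in Python. It is a simplification under stated assumptions: a segment ends at . ! ? : or ; when followed by whitespace, or at a hard enter, and commas never split. Splitting on embedded code is left out, and `split_segments` is a made-up name, not a function from any CAT tool.

```python
import re

# A boundary is whitespace that follows . ! ? : or ; or any run of hard enters.
# Commas are deliberately not boundaries.
_BOUNDARY = re.compile(r"(?<=[.!?:;])\s+|\n+")


def split_segments(text):
    """Split text into translation segments, dropping empty pieces."""
    return [part.strip() for part in _BOUNDARY.split(text) if part.strip()]


sample = (
    "This is a sentence. This is a sentence, with a comma! "
    "Press Space: it jumps\nNew line here"
)
for number, segment in enumerate(split_segments(sample), start=1):
    print(f"Pair {number}: {segment}")
```

Note that a rule this naive also splits after abbreviations such as "e.g. ", which is one reason real tools use more elaborate rules.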
[Edited by - Yomar on August 2, 2006 5:50:57 AM]
Don't localize, Loekalize!
That was a nice article indeed, although it might be outdated. As a game developer I always negotiate to hand the pain of localization over to the publisher. It's easier that way, and you can actually do that the way publishers rip your money off, and most of them won't mind; in fact sometimes they insist on doing it themselves. I don't wanna know what "hindernissenparcous" (a 19-char word???) means anyway, and I think most game developers act like I do.
Being a game developer is very painful compared to the general class of software developers when it comes to localization. Why?
1. Because we do not have the luxury of font libraries, GUI libs, etc. made for us. Arghh.. don't even get me started on those non-Latin/Unicode char set languages :)
2. We're not rich. You have to understand that sometimes when a developer is trying to get the cheapest translator available, it's not because we're cheap bastards. It's because we usually don't make a lot of money by distributing to a foreign country. Sometimes we just want to have our game played in a different part of the world. You'd be surprised how little we get when the publisher simply doesn't want to take any risk. Of course I'm not talking about the big boys here (big boys don't have this problem 'cause they can pay someone to do it well).
If there's one piece of advice: make sure you put all the text in your game not in the usual plain text file. Instead, use XML!! XML files are the friendliest text files available, with an open format and explicit character sets, and there are plenty of XML editors, parsers and libs that can make your life easier.
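As a rough illustration of keeping text out of the code and in XML, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The layout is purely an assumption for the example: the `strings` and `string` element names, the `lang` and `id` attributes, the string ids and the `load_strings` helper are all made up. The two Dutch lines are the ones quoted earlier in this thread.

```python
import xml.etree.ElementTree as ET

# In a real game this would be a separate file per language
# (loaded with ET.parse); it is inlined here to keep the sketch runnable.
SAMPLE_XML = """
<strings lang="nl">
  <string id="brian.takes_gun">Brian pakt het pistool</string>
  <string id="brian.drops_gun">Brian laat het pistool vallen</string>
</strings>
"""


def load_strings(xml_text):
    """Return a dict mapping string ids to translated text."""
    root = ET.fromstring(xml_text.strip())
    return {node.get("id"): (node.text or "") for node in root.iter("string")}


table = load_strings(SAMPLE_XML)
print(table["brian.takes_gun"])
```

The game code then only ever refers to ids like `brian.takes_gun`, and translators only ever touch the XML.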
ps: that word i mentioned above is taken from a benelux magazine that reviewed my game. It sounds like a latin name for a species :D
Ride the thrill of your life
Play Motorama
Quote: Try to imagine what happens if 25 people start writing your manual simultaneously. No management tool is going to help you with this. The result would be a disaster, with 25 different writing styles. The same applies to translation.
Well, each translator could take care of 100.000/25 = 4000 words... then it would take 2 days... and the other day for proofreading
I'm gonna go ahead and throw my own rants in here as I've dealt with translation a bit : )
One of the comments was about:
Brian %s1% the gun
takes
drops
There is no justifiable reason to keep your code like this. Sure, you reduce the size (whoop d doo) and minimally reduce possible errors, but you don't seem to be considering the format of other languages, and that's kind of the point.
1) Not all languages will translate 'Brian takes the gun' in the same way as 'Brian drops the gun'.
2) Not all languages will keep these things in order, as you could get 'noun the verb noun' or 'noun noun verb' or whatever combination. Also, you might know your target language and how the format/ordering changes (English to French is fairly consistent), but you can't be sure you'll never add another language.
Basically, you should never have sub strings in your localized text unless they're separate thoughts/sentences, such as "Some event: %s1%". Even this is really only acceptable some of the time...
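The problem with sub strings can be shown in a few lines of Python, using the Dutch sentences quoted earlier in this thread. This is only a sketch: the dictionary names, the string ids and the assumption that a translator would fill in "laat vallen" for "drops" are mine, for illustration.

```python
# Approach 1: one template plus a verb table (the "Brian %s1% the gun" idea).
TEMPLATE_NL = "Brian {verb} het pistool"
VERBS_NL = {"takes": "pakt", "drops": "laat vallen"}

# Approach 2: every sentence is stored and translated as a whole.
WHOLE_NL = {
    "brian.takes_gun": "Brian pakt het pistool",
    "brian.drops_gun": "Brian laat het pistool vallen",
}


def glued(verb_key):
    """Build the sentence by dropping a translated verb into the template."""
    return TEMPLATE_NL.format(verb=VERBS_NL[verb_key])


print(glued("takes"))               # matches the whole-sentence translation
print(glued("drops"))               # verb lands in the wrong place
print(WHOLE_NL["brian.drops_gun"])  # the translation given in the thread
```

With "takes" both approaches agree, but with "drops" the template cannot put the object between the two parts of the Dutch verb, so only the whole-sentence table produces the translation given in the thread.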
As for tips, consider automated translation builds, especially for the UI where fonts and content are changed, so that you can get a rough indication of size and spacing. I know the 50% comment touched on this, but in my experience, this wastes more time (from a developer standpoint) than most.
Also, to reiterate and add to tsloper's pet peeve: all your localized content should be in another file (a resource DLL is common in Windows), but keep in mind, you should not have things like ", " in your file that you'll be using to combine other entries. Like "Dog", "Cat", "Turtle", then programmatically appending them with the ", " text. Find ways to avoid this as it's going to be problematic ; )
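One way to avoid gluing entries together with a hard-coded ", " is to let each language supply its own list patterns as part of the localized content. The Python sketch below assumes a made-up `LIST_PATTERNS` table and `format_list` helper; the English and Dutch patterns are only illustrative and would normally come from the translators.

```python
# Hypothetical per-language list patterns, stored with the other strings.
LIST_PATTERNS = {
    "en": {"pair": "{0} and {1}", "start": "{0}, {1}", "end": "{0}, and {1}"},
    "nl": {"pair": "{0} en {1}", "start": "{0}, {1}", "end": "{0} en {1}"},
}


def format_list(items, lang):
    """Join already-translated items using the patterns of the given language."""
    patterns = LIST_PATTERNS[lang]
    items = list(items)
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return patterns["pair"].format(*items)
    result = items[0]
    for item in items[1:-1]:
        result = patterns["start"].format(result, item)
    return patterns["end"].format(result, items[-1])


print(format_list(["Dog", "Cat", "Turtle"], "en"))
print(format_list(["Hond", "Kat", "Schildpad"], "nl"))
```

The code never contains a separator of its own, so a language that joins lists differently only needs different patterns, not a code change.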
/rant
-Alamar
This topic is closed to new replies.