2014-02-22, 20:22:56
So here are some quick guidelines to keep in mind when making a piece of software.
1. Every string must be meaningful on its own
This necessitates using placeholders. For example, having separate strings for "Page" and "of" is very bad, because neither string means anything on its own. On the other hand, having one string of "Page {$pagenum} of {$totalpagenum}" is very good.
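To make this concrete, here is a minimal sketch in Python. The catalog layout, the PAGE_X_OF_Y ID and the page_label function are all made up for illustration, the Chinese string is my own example translation, and the placeholders use Python's {name} syntax rather than {$name}.

```python
# Hypothetical catalog: one complete, self-contained string per message.
CATALOG = {
    "en": {"PAGE_X_OF_Y": "Page {pagenum} of {totalpagenum}"},
    # Illustrative translation; note the numbers sit inside the sentence.
    "zh-TW": {"PAGE_X_OF_Y": "第 {pagenum} 頁，共 {totalpagenum} 頁"},
}


def page_label(locale: str, pagenum: int, totalpagenum: int) -> str:
    """Fill the named placeholders of the whole-sentence template."""
    template = CATALOG[locale]["PAGE_X_OF_Y"]
    return template.format(pagenum=pagenum, totalpagenum=totalpagenum)


print(page_label("en", 3, 10))
print(page_label("zh-TW", 3, 10))
```

With separate "Page" and "of" strings, the translator would have no way to wrap text around the numbers like this.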
2. Never force word order
This means the placeholders must be named so that they can be freely switched around. For example, this is very bad: having "%s by %s, %s" where the three string variables correspond to $action ("Updated", "Created" or "Posted"), $author and $timeago ("just now", "yesterday" or "last week").
What should be done is to have three strings like this (translations provided):
- "Updated by {$author}, {$timeago}" -> "{$timeago}由{$author}更新"
- "Created by {$author}, {$timeago}" -> "{$timeago}由{$author}建立"
- "Posted by {$author}, {$timeago}" -> "{$timeago}由{$author}發表"
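The three strings above could be wired up roughly like this in Python. This is only a sketch: the TEMPLATES dictionary, the string IDs and the render function are names I made up, and the placeholders are written in Python's {name} style. The translations are the ones listed above.

```python
# Hypothetical catalog: one full sentence per action, named placeholders.
TEMPLATES = {
    "en": {
        "UPDATED_BY": "Updated by {author}, {timeago}",
        "CREATED_BY": "Created by {author}, {timeago}",
        "POSTED_BY": "Posted by {author}, {timeago}",
    },
    "zh-TW": {
        # The translator moved {timeago} to the front; the code is unchanged.
        "UPDATED_BY": "{timeago}由{author}更新",
        "CREATED_BY": "{timeago}由{author}建立",
        "POSTED_BY": "{timeago}由{author}發表",
    },
}


def render(locale: str, string_id: str, **values: str) -> str:
    """Look up a template and substitute its placeholders by name."""
    return TEMPLATES[locale][string_id].format(**values)


print(render("en", "UPDATED_BY", author="Amy", timeago="yesterday"))
print(render("zh-TW", "UPDATED_BY", author="Amy", timeago="昨天"))
```

Because the values are passed by name, the calling code never needs to know which order a given language wants them in. The time phrase itself ("yesterday", "昨天") would of course come from its own translated string.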
3. Never reuse anything
This is more or less a natural implication of the previous two points. Another example is trying to reuse the string "by".
- {$posttitle} + "by" + {$author} -> {$posttitle} + "作者:" + {$author}
- "by" + {$sortcriteria} -> "依" + {$sortcriteria}
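In code, the fix is to give each use of "by" its own complete string instead of gluing a shared "by" between variables. A rough Python sketch, where the IDs POST_BYLINE and SORT_BY and the render function are hypothetical, and the exact spacing in the Chinese strings is my own guess:

```python
# Hypothetical catalog: each "by" lives inside its own full string.
CATALOG = {
    "en": {
        "POST_BYLINE": "{posttitle} by {author}",
        "SORT_BY": "by {sortcriteria}",
    },
    "zh-TW": {
        # The two English "by"s need two unrelated Chinese words.
        "POST_BYLINE": "{posttitle} 作者:{author}",
        "SORT_BY": "依{sortcriteria}",
    },
}


def render(locale: str, string_id: str, **values: str) -> str:
    """Look up a template and substitute its placeholders by name."""
    return CATALOG[locale][string_id].format(**values)


print(render("zh-TW", "POST_BYLINE", posttitle="Hello", author="Amy"))
print(render("zh-TW", "SORT_BY", sortcriteria="日期"))
```

A single shared STRING_BY entry could only hold one of the two Chinese words, so one of the two places would always read wrong.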
4. String ID should not be just the string content
The classic example is: STRING_NEXT = "Next".
Next what? Next page? Next step? In Chinese the translation for "Next" will include the measure word, and Chinese measure words are very different to English.
It would be better to use STRING_NEXTPAGE = "Next".
And yes, this again ties in with the previous points because now it would be weird to use STRING_NEXTPAGE for presenting a next step. So you would have two strings that seem redundant in English:
- STRING_NEXTPAGE = "Next" -> STRING_NEXTPAGE = "下一頁"
- STRING_NEXTSTEP = "Next" -> STRING_NEXTSTEP = "下一步"
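Here is a small Python sketch of why the two IDs are not actually redundant. The CATALOG layout and the ids_by_text helper are made up for illustration; the strings are the ones from the list above.

```python
from collections import defaultdict

# Hypothetical catalog keyed by locale, then by string ID.
CATALOG = {
    "en": {"STRING_NEXTPAGE": "Next", "STRING_NEXTSTEP": "Next"},
    "zh-TW": {"STRING_NEXTPAGE": "下一頁", "STRING_NEXTSTEP": "下一步"},
}


def ids_by_text(locale: str) -> dict[str, list[str]]:
    """Group string IDs by their text to see which ones look identical."""
    groups: dict[str, list[str]] = defaultdict(list)
    for string_id, text in CATALOG[locale].items():
        groups[text].append(string_id)
    return dict(groups)


# In English both IDs collapse onto "Next"; in Chinese they stay apart.
print(ids_by_text("en"))
print(ids_by_text("zh-TW"))
```

If the IDs were merged just because the English text matches, there would be nowhere to store the second Chinese translation.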
A good rule of thumb is to use the string ID to describe what that string does. Instead of STR_UPDATEDBY="Updated by {$author}", use STR_BACKUPNOTE_UPDATE="Updated by {$author}".
This way the translator can easily know that this is not just a string that says "Updated by $author", but one that will be used as the note in the list of backup data, as opposed to a small note under each post. Again, the suitable translations will differ between these two cases, and knowing the context makes it possible to write a more fitting, natural translation.
==
Having said all this, I also have to say that these are just guidelines, not rules set in stone. It's impossible to expect anyone to build a system that can cater for absolutely every language in the world. In fact, my current job, which came from a well-known global brand, has strings that make every single mistake I have described.
I've been in software localization for more than ten years, working as a freelancer, and have handled stuff for Microsoft, Adobe, Sun, Samsung, Sony Ericsson, Asus, BlackBerry, etc. via my agent. They all make these mistakes, all of them, all the time. In general programmers don't care about localization even though they know they want to launch the product globally. They just don't care. Most translators are used to it anyway.
So don't worry too much about it. It won't be the end of the world if you ignore these guidelines. Following the guidelines allows your software to have that "polish", if the translator is good. In the end you may decide to, for example, not use placeholders but just the built-in variable substitution of sprintf. You lose some leeway in the translation, but it makes the system more efficient or more reliable. That's perfectly reasonable.
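To show what that lost leeway looks like, here is a rough sketch using Python's printf-style % operator as a stand-in for sprintf. The function names are made up, and this is only an illustration of the trade-off, not a recommendation either way.

```python
def render_positional(template: str, author: str, timeago: str) -> str:
    """Plain %s substitution: values are always filled in code order."""
    return template % (author, timeago)


def render_named(template: str, author: str, timeago: str) -> str:
    """Named %(key)s substitution: the template decides the order."""
    return template % {"author": author, "timeago": timeago}


# English is fine either way.
print(render_positional("Updated by %s, %s", "Amy", "yesterday"))

# A time-first Chinese template gets its values in the wrong slots...
print(render_positional("%s由%s更新", "Amy", "昨天"))

# ...unless the placeholders are named.
print(render_named("%(timeago)s由%(author)s更新", "Amy", "昨天"))
```

With plain %s the translator has to keep the author first and work around it. Some sprintf implementations (PHP's, for example) also accept numbered arguments like %1$s and %2$s, which gives back the reordering without named placeholders.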
I hope I haven't been too much of a pain for you all. Thanks for reading.