What are i18n and L10n?
i18n is short for Internationalization: there are 18 characters between
the initial "i" and the final "n". By the same token, L10n is short for
Localization, with 10 characters between the "L" and the "n".
What are locales?
A locale is a collection of standard settings, rules, and data specific
to a language and geographical region. For example, fr_CA, the Canadian
French locale, differs from fr_FR, the locale for French as used in France.
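As a rough illustration of how two locales that share a language can still carry different settings, here is a minimal Python sketch. The LOCALE_DATA table, split_locale, and format_date are hypothetical names, and the conventions in the table are hand-written for illustration, not taken from a real locale database.

```python
from datetime import date

# Illustrative, hand-written conventions (an assumption, not real locale data).
LOCALE_DATA = {
    "fr_CA": {"date_format": "%Y-%m-%d", "currency_symbol": "$"},
    "fr_FR": {"date_format": "%d/%m/%Y", "currency_symbol": "€"},
}

def split_locale(locale_id):
    """Split an identifier such as 'fr_CA' into (language, region)."""
    language, region = locale_id.split("_")
    return language, region

def format_date(d, locale_id):
    """Format a date with the convention stored for the given locale."""
    return d.strftime(LOCALE_DATA[locale_id]["date_format"])

sample = date(2024, 3, 5)
for loc in ("fr_CA", "fr_FR"):
    print(loc, split_locale(loc), format_date(sample, loc))
# fr_CA ('fr', 'CA') 2024-03-05
# fr_FR ('fr', 'FR') 05/03/2024
```

Both identifiers share the language code "fr"; only the region code, and the data attached to it, differs.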
What activities are typically
involved in a localization project?
A localization project usually involves the following activities: Terminology
setup > Translation of Software > Translation of Online Help and
Documentation > Linguistic Review > Engineering and Testing of
Software > Screen Captures, Help Engineering and DTP of Documentation
> Product QA and Delivery
What is pseudo-localization?
A method used to identify hard-coded messages in source code by adding
a prefix and suffix of double-byte or accented characters to each
translatable text. When the program is run against the pseudo-translated
text, any message displayed without the prefix and suffix was not loaded
from the translatable resources, so it must have been hard-coded. For a
detailed description of this technique, please refer to the
Internationalization (i18n) Testing page of Sun Microsystems.
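The technique can be sketched in a few lines of Python. This is only a minimal illustration: PREFIX, SUFFIX, pseudo_localize, and find_hard_coded are hypothetical names, and the "screen" list stands in for whatever text a real program would display.

```python
# Marker strings mixing double-byte and accented characters (chosen arbitrarily).
PREFIX = "[日本é "
SUFFIX = " é本日]"

def pseudo_localize(catalog):
    """Wrap every translatable message in the prefix and suffix."""
    return {key: PREFIX + text + SUFFIX for key, text in catalog.items()}

def find_hard_coded(displayed_messages):
    """Return displayed messages that lack the markers, i.e. hard-coded ones."""
    return [
        m for m in displayed_messages
        if not (m.startswith(PREFIX) and m.endswith(SUFFIX))
    ]

catalog = {"greeting": "Hello", "farewell": "Goodbye"}
pseudo = pseudo_localize(catalog)

# Simulated screen output: two catalog messages and one hard-coded literal.
screen = [pseudo["greeting"], pseudo["farewell"], "Error: file not found"]
print(find_hard_coded(screen))
# ['Error: file not found']
```

Using non-ASCII markers has a side benefit: text that is mangled on screen points to an encoding problem in the display path.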
What is translation memory?
A technology that lets translators store translated phrases or sentences
in a database. It is a tool that improves the consistency of a translation
and helps translators reuse previously translated phrases. Translation
memory tools include Trados, STAR Transit, and Déjà Vu.
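The core idea, a store of source and target segments with exact and near-match lookup, can be sketched as follows. This is a simplified Python illustration and not how any of the named products work; the TranslationMemory class and the 0.75 similarity threshold are assumptions made for the example.

```python
import difflib

class TranslationMemory:
    """Toy translation memory: source segment -> stored translation."""

    def __init__(self):
        self._segments = {}

    def store(self, source, target):
        self._segments[source] = target

    def lookup(self, source, threshold=0.75):
        """Return (translation, similarity); (None, 0.0) if nothing is close."""
        if source in self._segments:
            return self._segments[source], 1.0
        best = difflib.get_close_matches(
            source, list(self._segments), n=1, cutoff=threshold
        )
        if not best:
            return None, 0.0
        score = difflib.SequenceMatcher(None, source, best[0]).ratio()
        return self._segments[best[0]], score

tm = TranslationMemory()
tm.store("Save the file", "Enregistrer le fichier")

print(tm.lookup("Save the file"))   # exact match, similarity 1.0
print(tm.lookup("Save the files"))  # near match, offered for review
print(tm.lookup("Quit"))            # (None, 0.0)
```

A near match is only a suggestion: the translator still has to adapt the stored translation to the new sentence.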
What are properties files?
A Java term for message catalogs. A properties file stores each message
as a key=value pair.
What are message catalogs?
A file that contains the GUI messages extracted from a program's source
code so that they can be translated separately.
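To show how a program might look up messages by key instead of hard-coding them, here is a minimal Python sketch that reads key=value text in the style of a properties file. The parser is deliberately simplified (it ignores line continuations, ":" separators, and Unicode escapes), and the keys, file name, and function names are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines; skip blank lines and # or ! comments."""
    messages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        messages[key.strip()] = value.strip()
    return messages

EN = """
# messages_en.properties (hypothetical file name)
button.ok = OK
button.cancel = Cancel
"""

FR = """
button.ok = OK
button.cancel = Annuler
"""

catalogs = {"en": parse_properties(EN), "fr": parse_properties(FR)}

def get_message(key, language):
    """Look up a message by key in the catalog for the given language."""
    return catalogs[language][key]

print(get_message("button.cancel", "en"))  # Cancel
print(get_message("button.cancel", "fr"))  # Annuler
```

The program refers only to keys such as button.cancel, so adding a language means adding a catalog, not editing code.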
What is an encoding system?
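Computers use numbers to represent the characters seen on the screen. An
encoding system is the mapping of numbers to characters. For example,
the ASCII encoding system uses 97 to represent the character "a".
ASCII is a 7-bit encoding and covers only the values 0 to 127, so
accented characters need a larger encoding: ISO 8859-1 (Latin-1), an
8-bit superset of ASCII, uses 233 to represent the character "é".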
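The mapping can be checked directly in Python, as in the short sketch below. Note that "é" lies outside 7-bit ASCII, so the example uses the Latin-1 (ISO 8859-1) codec for it; the variable names are arbitrary.

```python
# Character -> number, using two different encoding systems.
ascii_a = "a".encode("ascii")[0]
latin1_e = "é".encode("latin-1")[0]
print(ascii_a)   # 97
print(latin1_e)  # 233

# ASCII stops at 127, so it has no number for "é".
try:
    "é".encode("ascii")
    ascii_has_e_acute = True
except UnicodeEncodeError:
    ascii_has_e_acute = False
print(ascii_has_e_acute)  # False

# Number -> character: the same bytes decoded back to text.
decoded = bytes([97, 233]).decode("latin-1")
print(decoded)  # aé
```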
What is DBCS?
Short for Double-Byte Character Set, DBCS is a character encoding
system that uses one or two bytes to represent a character. Languages
that use double-byte character sets are Chinese, Japanese, and Korean,
the so-called CJK languages.
What are GB2312, Big-5, and Shift-JIS?
GB2312 is the encoding system for Simplified Chinese; Big-5 is the encoding
system for Traditional Chinese used in Taiwan and Hong Kong; Shift-JIS
is one of the encoding systems for Japanese.
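The "one or two bytes" behaviour of these encodings can be observed with Python's built-in codecs. This is a small sketch that assumes the codec names gb2312, big5, and shift_jis as provided by the Python standard library; the helper name encoded_bytes is hypothetical.

```python
def encoded_bytes(text, encoding):
    """Encode text with the named legacy encoding."""
    return text.encode(encoding)

results = {}
for encoding in ("gb2312", "big5", "shift_jis"):
    single = encoded_bytes("A", encoding)   # ASCII letter: one byte
    double = encoded_bytes("中", encoding)  # CJK ideograph: two bytes
    results[encoding] = (single, double)
    print(encoding, len(single), len(double), double.hex())
```

The same ideograph gets a different two-byte value in each encoding, which is why text must be decoded with the encoding it was written in.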
What is Unicode?
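A worldwide character-encoding standard that assigns a unique number
(code point) to each character of all major writing systems. Unicode was
originally designed as a 16-bit encoding, but it now defines code points
up to U+10FFFF, which are stored using encoding forms such as UTF-8,
UTF-16, and UTF-32.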
What is UTF-8?
An encoding form of Unicode that uses one to four bytes per character.
ASCII characters keep their single-byte values for backward compatibility,
and every other Unicode character is encoded as a multi-byte sequence.
UTF-8 is short for Unicode Transformation Format, 8-bit.
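A short Python sketch makes the variable length visible: each character below has one Unicode code point, but its UTF-8 form takes between one and four bytes. The sample characters are arbitrary choices.

```python
samples = ("a", "é", "中", "\U0001F600")

utf8_lengths = {}
for ch in samples:
    data = ch.encode("utf-8")
    utf8_lengths[ch] = len(data)
    print(f"U+{ord(ch):04X} -> {len(data)} byte(s): {data.hex()}")
# U+0061 -> 1 byte(s): 61
# U+00E9 -> 2 byte(s): c3a9
# U+4E2D -> 3 byte(s): e4b8ad
# U+1F600 -> 4 byte(s): f09f9880

# Backward compatibility: pure ASCII text is byte-for-byte identical in UTF-8.
ascii_compatible = "hello".encode("utf-8") == "hello".encode("ascii")
print(ascii_compatible)  # True
```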
How do I write programs that work
on both Windows NT/2000/XP and Windows 95/98/ME?
Windows NT/2000/XP use Unicode as their internal encoding, but Windows
95/98/ME still use the legacy encoding systems. For a program to work
on both NT and 98, for example, text has to be translated between the
encoding systems. Microsoft has designed the Microsoft Layer for Unicode
(MSLU), which makes the translation fairly painless. For detailed
information, see MSDN at /www.microsoft.com/GLOBALDEV/articles/mslu_announce.asp
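The kind of translation involved can be illustrated conceptually in Python. This sketch is not the MSLU API and does not call Windows at all; it only shows a round trip between a legacy code page and a Unicode form. The choice of cp1252 as the legacy code page and UTF-16 little-endian as the Unicode form, as well as the function names, are assumptions made for the example.

```python
def legacy_to_utf16(data, code_page="cp1252"):
    """Decode bytes from a legacy code page and re-encode them as UTF-16LE."""
    return data.decode(code_page).encode("utf-16-le")

def utf16_to_legacy(data, code_page="cp1252"):
    """Decode UTF-16LE bytes and re-encode them in a legacy code page."""
    return data.decode("utf-16-le").encode(code_page)

legacy = "café".encode("cp1252")   # one byte per character
wide = legacy_to_utf16(legacy)     # two bytes per character

# The round trip is lossless as long as the code page contains every character.
assert utf16_to_legacy(wide) == legacy
print(len(legacy), len(wide))  # 4 8
```

Converting from Unicode to a legacy code page fails or loses data for characters the code page does not contain, so the direction of the conversion matters.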