
10 Internationalization tips for developers – I18N checklist

At Globalme we have been getting more and more requests for web internationalization consulting, tips and checklists. This is great news! It seems development teams increasingly realize that building internationalized desktop and web applications not only makes the job much easier when it is time for localization, but also promotes good development practices even when the application is intended for a single language (and we all know that this intention, however strong, only lasts until the sales team says “We need this application in 4 more languages”).

This is the first in a series of write-ups that I am planning to post. It covers generic internationalization tips that apply to any application regardless of the programming language. It will be followed by posts focused on specific languages and frameworks, such as “How to internationalize Ruby on Rails applications” and “Internationalization process for ASP.NET applications”. You get the idea…

Before I get into the internationalization tips, I would like to clarify a terminology confusion that seems to be very common:

What is the difference between internationalization and localization?

Internationalization (I18N) is the process of making code ready so that it can be localized for a specific region and language.
Localization (L10N) is the process of adapting the application content to meet the language, cultural and other requirements of a specific target market.

Basically, internationalization is what coders do to get an application ready for the content changes that localizers need to implement (translation, style changes, etc.).

10 Internationalization tips for developers

Following the tips below will ensure that you have your bases covered while you develop. That is, when the time comes and management brings in a localization vendor, your code will be almost ready for the requirements. I say “almost” because there are a few other things that will need to be implemented, which I will cover in upcoming posts (such as language selection). Let’s start with the basics…

Externalize all translatable content – Take the text out of the code and place in resource files

This is an essential requirement for a properly internationalized application. Separating the translatable text from the code avoids code duplication, lets localizers and developers work on updates simultaneously, and removes the possibility of damaging code during translation. You are keeping the presentation and business logic separate, right? This takes that a step further and keeps the translatable items separate from the view.

Example in Ruby on Rails:

[Image: externalize strings]
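
The same idea can be sketched in a language-neutral way. The Python snippet below is only an illustration: the keys, the translations and the `translate` helper are made up for this example, and a real project would load each locale from its own resource file instead of an in-memory dictionary.

```python
# Hypothetical resource bundles. In a real project each locale would live in
# its own resource file maintained by translators, not in the source code.
RESOURCES = {
    "en": {"greeting": "Welcome back!", "logout": "Sign out"},
    "fr": {"greeting": "Bon retour !", "logout": "Se déconnecter"},
}


def translate(key: str, locale: str, default_locale: str = "en") -> str:
    """Look up a UI string by key, falling back to the default locale."""
    bundle = RESOURCES.get(locale, {})
    if key in bundle:
        return bundle[key]
    return RESOURCES[default_locale][key]


# The code only ever refers to keys, never to the visible text itself.
print(translate("greeting", "fr"))
print(translate("logout", "de"))  # no German bundle here, so English is used
```

Because the views only reference keys, translators can edit the bundles without touching application code.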

 

Allow input of international data and foreign scripts

Input fields often do character validation. Validation rules should allow the input of foreign characters. In the example below, the input field complains about the “special character” é and does not allow the user to proceed.

[Image: foreign characters localization]

Another very common example is the validation error for postal codes. If you code the field for the US ZIP code format, users with a Canadian postal code will not be able to pass the validation. If you do need to validate the postal code, make sure you attach the validation rule to the specific country and do not enforce it on others (or have the validation rule update when the country selection changes).
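
As a minimal sketch of both points, assuming Python and two hypothetical helpers (`is_valid_name` and `is_valid_postal_code`), validation can accept letters from any script and tie postal code rules to the selected country. The patterns are deliberately simplified and are not complete postal rules.

```python
import re
import unicodedata

# Postal code patterns keyed by country code. Only the two countries mentioned
# in the text are listed, and both patterns are simplified for illustration.
POSTAL_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),
    "CA": re.compile(r"[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]\d"),
}


def is_valid_postal_code(code: str, country: str) -> bool:
    """Apply a format rule only when one exists for the selected country."""
    code = code.strip()
    pattern = POSTAL_PATTERNS.get(country)
    if pattern is None:
        return bool(code)  # unknown country: only require a non-empty value
    return pattern.fullmatch(code) is not None


def is_valid_name(name: str) -> bool:
    """Accept letters and combining marks from any script, plus a few separators."""
    name = name.strip()
    return bool(name) and all(
        unicodedata.category(ch)[0] in "LM" or ch in " -'’" for ch in name
    )


print(is_valid_name("Renée"))                  # accented letters are letters
print(is_valid_postal_code("K1A 0B1", "CA"))   # checked against the Canadian rule
print(is_valid_postal_code("K1A 0B1", "US"))   # fails only if the user chose US
```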

Avoid string concatenation

Concatenation only works when your content is written for a specific language. Avoid constructing strings through concatenation as this makes translation hard – even impossible in certain cases. See the example below – in Java:

 

[Image: internationalization string concatenation]

This message is not translatable because the order of the sentence elements is hardcoded by the concatenation. Instead, use named variables which can be moved around, like the example below:

[Image: string concatenation variable]

Linguists will be able to move the variable as necessary.
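
Here is a small Python sketch of the difference. The message, the placeholder names and the `render` helper are invented for illustration; the point is only that named placeholders let each translation choose its own word order.

```python
# Anti-pattern: the word order is frozen in code, so translators cannot change it.
#   message = user + " uploaded " + str(count) + " photos to " + album + "."

# Hypothetical templates with named placeholders. Each language is free to put
# the placeholders wherever its grammar needs them.
TEMPLATES = {
    "en": "{user} uploaded {count} photos to {album}.",
    "de": "{user} hat {count} Fotos in {album} hochgeladen.",
    "ja": "{user}さんが{album}に{count}枚の写真をアップロードしました。",
}


def render(locale: str, **values) -> str:
    """Fill the locale's template with named values."""
    return TEMPLATES[locale].format(**values)


print(render("en", user="Ana", count=3, album="Lisbon"))
print(render("de", user="Ana", count=3, album="Lisbon"))
print(render("ja", user="Ana", count=3, album="Lisbon"))
```

Note that in the Japanese template the album comes before the count, which plain concatenation could not express. Plural forms are a separate problem that this sketch does not try to solve.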

Avoid using a variable in more than one context

Often people think that translation costs can be reduced by reusing strings. Unfortunately this approach leads to extra costs rather than savings. See the example below:

[Image: variable context]

This form will not work for many languages as the verb will be different depending on the product name (feminine/masculine etc.). As a rule of thumb, do not use a noun as a parameter in a sentence and avoid reusing strings. Translation tools let linguists recycle previously translated strings during the translation pass and therefore will bring these savings anyway.
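
To make this concrete, here is a hedged Python sketch with made-up keys and messages. Instead of one reused template such as "The {item} has been deleted." with a noun plugged in, each context gets its own complete sentence, so a language like French can make the article and the participle agree with the noun.

```python
# Hypothetical message catalog: one full sentence per context, no noun parameter.
MESSAGES = {
    "en": {
        "file.deleted": "The file has been deleted.",
        "image.deleted": "The image has been deleted.",
    },
    "fr": {
        # "fichier" is masculine and "image" is feminine, so both the article
        # and the past participle differ between the two sentences.
        "file.deleted": "Le fichier a été supprimé.",
        "image.deleted": "L'image a été supprimée.",
    },
}


def message(key: str, locale: str) -> str:
    """Return the complete, context-specific sentence for a locale."""
    return MESSAGES[locale][key]


print(message("file.deleted", "fr"))
print(message("image.deleted", "fr"))
```

The English catalog looks repetitive, but the repetition costs little, since translation tools reuse matching segments anyway.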

Do all string handling with Unicode

An internationalized application uses Unicode for all handling of strings and text. This applies to the static text as well as the dynamic text that is communicated between the application and the database. Unicode is a much broader topic than I can fit here. I strongly recommend that you read this great article – The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

Make sure the characters do not get corrupted along the input > database > output route:

[Image: unicode string handling]
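
A quick way to check that route is a round-trip test. The sketch below assumes Python and uses an in-memory SQLite database as a stand-in for whatever database the application really uses; the table name and the `round_trip` helper are hypothetical.

```python
import sqlite3


def round_trip(text: str) -> str:
    """Send text through an encode/decode step and a database, then read it back."""
    raw = text.encode("utf-8")     # bytes, as they might arrive from a web form
    decoded = raw.decode("utf-8")  # decode once, at the boundary, as UTF-8
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE comments (body TEXT)")
        conn.execute("INSERT INTO comments (body) VALUES (?)", (decoded,))
        (stored,) = conn.execute("SELECT body FROM comments").fetchone()
    finally:
        conn.close()
    return stored


samples = ["Renée", "日本語", "Ελληνικά", "😀"]
for sample in samples:
    assert round_trip(sample) == sample

# What corruption looks like: UTF-8 bytes decoded with the wrong charset.
print("é".encode("utf-8").decode("latin-1"))  # prints "Ã©"
```

If any step in the real pipeline assumes a different encoding, a test like this fails on the non-ASCII samples while plain English text still passes, which is why such bugs often go unnoticed.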

Provide extra room for text expansion – User Interface

Translated text expands 30% on average with the exception of some languages where it may shrink. Leave enough room on your layout for expansion and avoid static sizing. If there are strings that should not exceed a certain size, always include comments in the resource file for those items.

A .NET .resx file:

[Image: string expansion]
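
Length limits noted in the resource file can also be checked automatically once translations come back. The Python sketch below is an assumption about how that might look: the `max_length` field, the keys and the `find_overflows` helper are all hypothetical and not part of any resource file format.

```python
# Hypothetical source strings with translator comments and optional size limits.
RESOURCES = {
    "save_button": {
        "text": "Save",
        "max_length": 12,
        "comment": "Toolbar button; must fit in 12 characters.",
    },
    "settings_tab": {
        "text": "Settings",
        "max_length": 10,
        "comment": "Tab label; must fit in 10 characters.",
    },
    "welcome": {"text": "Welcome back!", "max_length": None, "comment": ""},
}

GERMAN = {
    "save_button": "Speichern",
    "settings_tab": "Einstellungen",
    "welcome": "Willkommen zurück!",
}


def find_overflows(resources: dict, translations: dict) -> list:
    """Return the keys whose translation is longer than the documented limit."""
    overflows = []
    for key, entry in resources.items():
        limit = entry.get("max_length")
        translated = translations.get(key, entry["text"])
        if limit is not None and len(translated) > limit:
            overflows.append(key)
    return overflows


print(find_overflows(RESOURCES, GERMAN))  # ['settings_tab']
```

Character counts are only a rough proxy for rendered width, so a check like this complements flexible layouts and does not replace them.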

 

Add context information to strings using comments when necessary

A string can be translated to a foreign language in many different ways. It is very important to provide context information in the resource file when necessary.

From a gettext .po file:

[Image: translator comments]
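
For readers who have not seen one, the sketch below builds a single .po entry as text. The `po_entry` helper and the sample strings are made up; the `#.` prefix for comments extracted from the source code and the optional `msgctxt` line are standard parts of the PO format.

```python
def _escape(text: str) -> str:
    """Escape backslashes and double quotes for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def po_entry(msgid: str, comment: str, context: str = "") -> str:
    """Build one PO entry carrying a comment (and optional context) for translators."""
    lines = [f"#. {line}" for line in comment.splitlines()]
    if context:
        lines.append(f'msgctxt "{_escape(context)}"')
    lines.append(f'msgid "{_escape(msgid)}"')
    lines.append('msgstr ""')
    return "\n".join(lines)


# "Open" can be a verb (open a file) or an adjective (the shop is open).
print(po_entry("Open", "Verb: label on the button that opens a file.", "button"))
```

Without the comment, a translator seeing only "Open" has to guess which meaning is intended.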

Use system functions for date/time and numeric formatting

Date/time and numeric formatting differ even between regions that speak the same language. For example, the US uses MM/DD/YYYY for dates while the UK uses DD/MM/YYYY.

Ruby on Rails example:

[Image: regional formatting]
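
As a rough equivalent in Python, the standard `locale` module can delegate formatting to the operating system. This is a sketch under assumptions: the `use_locale`, `format_date` and `format_number` helpers are hypothetical, locale names such as `en_US.UTF-8` or `en_GB.UTF-8` only work if they are installed on the machine, and `setlocale` changes process-wide state, so this pattern is not safe in multi-threaded code.

```python
import locale
from contextlib import contextmanager
from datetime import date


@contextmanager
def use_locale(name: str):
    """Temporarily switch the process locale, restoring the previous one after."""
    saved = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, name)  # raises locale.Error if not installed
        yield
    finally:
        locale.setlocale(locale.LC_ALL, saved)


def format_date(value: date, locale_name: str) -> str:
    """Format a date with the locale's own short date pattern."""
    with use_locale(locale_name):
        return value.strftime("%x")


def format_number(value: float, locale_name: str) -> str:
    """Format a number with the locale's decimal point and digit grouping."""
    with use_locale(locale_name):
        return locale.format_string("%.2f", value, grouping=True)


# "C" is always available; swap in a regional locale installed on your system.
print(format_date(date(2021, 3, 4), "C"))
print(format_number(1234.5, "C"))
```

The application code never spells out a date pattern or a thousands separator; it asks the system for whatever the chosen region uses.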

Externalize all styles and formatting

Font face, size and style will be different for some languages. Inline styling prevents these modifications or requires code duplication (which should be avoided at all costs). Always use external style sheets to define styles for a web application.

You should also avoid using styling tags such as “em” and “strong”. Using italic text, for instance, is not common in Japanese and Chinese. Bold font faces cause problems for these languages as well, since bold strokes may result in a big blob of ink when the font size is small.

If you want to emphasize a string with a bold font face, do it by externalizing the style. This way, the localizers of Western languages can follow the English emphasis, whereas those localizing for Japanese can specify a larger font size.

Use system functions for sorting and string comparison

String sorting is a very important aspect of web and software applications. An internationalized application does not use any manual sorting logic and relies on the underlying framework’s API for string comparison. This applies to database data as well as the strings that come from resource files (you have externalized everything, right?) which may be used in form elements and others such as combo boxes.

[Image: .NET example from MSDN]
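
A comparable sketch in Python, using the standard `locale` module instead of hand-written comparison logic. The `sort_for_locale` helper is hypothetical, and the result depends on which locales are installed on the machine, so only the always-available "C" locale is used here.

```python
import locale


def sort_for_locale(words: list, locale_name: str) -> list:
    """Sort strings with the collation rules of the given system locale."""
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm turns each string into a key that sorts in locale order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


# The "C" locale compares by code point, so uppercase sorts before lowercase.
print(sort_for_locale(["b", "a", "C"], "C"))
```

With a regional locale installed (for example a Swedish or German one, assumed here and not guaranteed), the same call would order accented letters the way users in that region expect, with no change to the application code.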

Conclusion

In this post, I listed the basics of software and web internationalization. This is just the tip of the iceberg. In the upcoming posts, we will look at programming language specific details for ASP.NET, Ruby on Rails and Java (to begin with). Stay tuned.

Categories: Internationalization.


One Response

  1. Thanks Emre – great post. We learned many of these lessons the hard, expensive way localising a Rails app – Kyero.com – into multiple languages.

    Recently, we launched LocaleApp.com to help other Rails developers localise their apps, and benefit from our mistakes.

    It solves the problems of hand-editing YAML resource files and allows developers, content owners and translators to collaborate online.

    We’re in public beta right now and we’d really value your feedback.


