There are two kinds of software internationalization you can refer to - built in to the product from the start, and performed on existing code. The kind of internationalization (i18n) this article invokes isn’t the sort that’s designed into a product right from conception. That is less common, though the pull of global markets is changing that tide. Few application development teams have historically had the opportunity to incorporate world market foresight. They had to produce a product to market for the most immediate business requirements. So then most internationalization happens on existing code because someone sells something, a global company buys another company, or a strategic initiative has taken form. Suddenly there is a new requirement for software to work in any number of new languages and locales. Business requirements drive technical schedules first, rather than involving a creative path of inventing new cool functionality or products from the ground up.
I’m tempted to just write Don’t Panic, carry a towel and avoid Vogon poetry - and while you’re at it, Unicode’s pretty good stuff. I’m being flippant because internationalization efforts tend to each have their own unique challenges when you get into the details. I’ll instead provide this article as a series of i18n process tips that apply across the board. In general Internationalization (i18n) is messy, full of exceptions, and generally not considered optimally from a development perspective. Maybe that should be tip one.
Tip One: Internationalization is ugly. Expect that from the start. You are reverse engineering basic logic of how your software inputs, stores, retrieves, transforms and displays data. You are adding user interaction functionality that your product wasn’t originally designed to do. It’s rarely just about embedded strings. There are a lot of things that can go wrong. It’s a lot of work. In some cases you can run into weird stuff from areas such as compilers, middleware, database connectivity, and even low level operating system issues.
Tip Two: Get the big picture questions handled quickly. That is, what are the high level requirements, how much time do you have, how much time do you need and how much budget can you get? Be prepared to ask for what you need in the CFO’s or CEO’s language.
Tip Three: Remember what’s driving this - Revenue. Internationalizing a complex application is a big new requirement. Don’t underestimate. Being late will cause delays in revenue, stall marketing and sales investments and make you very unpopular. Do it poorly and rushed, and your product will be shabby for the very new customers you seek.
Tip Four: Do some good research or get help identifying requirements. For instance, consider language only as one aspect of a locale. English is a language. Yet England is a different locale, with different expected behavior than the States. Consider numerical formats, dates, times, postal addresses, phone numbers, paper size, currencies and more. Then add the specifics that your application may need, like any possible customizations of workflow, locale selection and more. Consider what the optimal character encoding implementation strategy is for your computer platforms, application tiers, programming languages, database requirements, etc.
Tip Five: Get some good code intelligence. Tools like our Globalyzer software let you comb through your source and identify all kinds of internationalization issues right up front. It’s way better to get a good inventory of what you need to inspect and change, rather than hunting through your myriad lines of code trying to anticipate all kinds of variable conditions using grep, and then trial and error your way through the boatloads of issues you’ll miss.
We are just adding a new capability to Globalyzer called Diagnostics. It will give you summary information internationalization readiness and issues found in your code. It’s fully functional even with just a trial Globalyzer license. No excuses, it’s free to use all you want.
Tip Six: Prepare for nests of difficulties depending upon your programming language(s), database and third party products. Programming languages rate differently in terms of difficulty to internationalize. For instance C and C++ are harder, with many hundreds of potential issues, compared to Java and C#, which have quite a bit of internationalization baked in. But Java and C# don’t internationalize themselves. You have to use their frameworks, which are very capable. The good thing is that when a programming language has well designed internationalization capability, the work goes faster.
Tip Seven: Third party products can cause some challenges. They are not always built for your new internationalization needs. For instance, a couple of years ago we worked on a product that used a third party product for displaying animations in a kid’s game. At first glance, you wouldn’t think it would be an issue, as there was no text being processed or displayed. But when we looked at things more closely, user name and file path info was being passed into the animation tool, which in this case could very well involve wide characters (e.g. Chinese). But the particular version of the animation product, could not support this and so it would always crash. The fix took time and some inventiveness.
Another example involved a third party product that generated a spreadsheet view. While data within the cells was handling Kanji just fine, tabs were corrupting. The third party product provider had declared their product Unicode compliant, but in practice it wasn’t done all the way through. The choice became to find a better third party product to replace this one, or get the spreadsheet provider to fix their product -which they may or may not want to do on your schedule.
Tip Eight: Remember your i18n fundamentals. Don’t embed strings or concatenate them. Watch out for sorting. A and Z are not the beginning and end of all alphabets - some languages don’t use the concept of alphabets. Don’t hardcode fonts. Remember your interface Geometry will need to expand. Use functions, methods or classes that adapt to locale needs. Use Locale adapting sorting (i.e. java.text.Collator class in Java) or let your database perform sorting for you whenever possible.
You can automate aspects of repetitive like string externalization using Globalyzer. It makes that tedious job go much faster.
Tip Nine: Account for merging code with parallel feature developments. This can be tricky, as your new feature development cycles could be quite different from your internationalization milestones. In most cases, be prepared to branch the code for internationalization efforts.
Tip Ten: Use Pseudo Localization (PseudoJudo in Globalyzer) to perform many internationalization functional tests before your localize. That means you add pad characters from target locales to the beginning and end of strings, and stretch the whole string based on target requirements. You’ll then be able to see how those strings behave in your display and moving through application tiers, without your engineers needing to understand the target language.
Bonus Tip Eleven: Plan for QA to take longer than it did when your app was just monolingual. Remember, you have internationalization functional testing and bug fixing, with new testing cases, and then, should you be localizing, you have linguistic testing.