Tuesday, September 20, 2011

A Tragedy: When Localization met Interpolation (repost)

This is a repost from my old blog. When I end-of-lifed that Blog, I said might pull some of those over here. Lukas's comments on the previous Syntactic Tartness for Macro Expansion reminded me of this one. At the time I wrote that, I failed to give credit to Steve Dahl, who I had spent about 2 days kicking ideas back and forth with on Skype about this.

Adrian Kuhn commented in a previous post:

"In general, I think that Smalltalk desperately misses String interpolation!"

Yes, I agree. But the devils running amuck in the details. The problem is that I18n translation made it first. There's two ways that I've seen to do translation: 1) statically recompile your program, making a sweep of all (or subset) of the strings in the program and translating them 2) doing translation at runtime, by looking up the string in some database. The first is pretty old school. :)

The challenges with String Interpolation are a few:

Far Reaching Changes


You need to modify the compiler. You might be able to try to have a unary message that evaluates a string, compiling expressions on the fly in the context of the sender. In this case, the tools can do nothing to help you get the code right. They get confused, because you create variables that don't appear to be referenced. You can't do it right in the context of clean closures. So a parser/compiler change is definitely in order. And don't forget the usual rant, it's not enough to modify just the base compiler. You've got to look at it in the context of the debugger's ability to put breakpoints as well as step through code, in the context of the RB parser, which you want for rewrites and formatting, as well as code highlighting maybe. This does not make it undoable. It simply means it's not a quick thing you can whip up over the weekend.

What Is It When?


So lets say you have the expression:
string := '[[customer name]] is [[customer age]] years old'.

If you use it as the argument of a nextPutAll: send, you probably want it to be in expanded form. But, if you want to look it up in an I18n dictionary, then you would rather have it in symbolic form.
You could detect string literals with whatever the expression preamble is ([[ in the examples above) at compile time and build some sort of complex literal that held both blocks as well as a string. But you'd still have to figure out how to resolve the message =. In an assert: form, you'd probably want expanded, but at dictionary look up time for translation you'd want the compressed form.

What Happened Elsewhere


It's been interesting to spend a little time looking at how Ruby and Python do these. This page makes it clear that the Ruby gettext guys also understood that it is hard to have your cake and eat it too when it comes to mixing interpolation and localization. The answer isn't as definitive with Python, but I think I've convinced myself it's a similar story after looking at docs for a little while.

No comments:

Post a Comment