Friday, August 4, 2017

One technical approach to one bilingual website.


I am updating my website, and I want it to be bilingual, supporting Finnish.

The technical solution I'm thinking to employ will be to serve the same page irrespective of language, but alter its content on the backend before serving it. Since the Finnish version will be hosted at fi.rendall.tv, but it essentially just points to rendall.tv it's a simple matter to see if the request URI is prepended with fi, and use that information as the "use the Finnish version" flag.  Otherwise, just serve whatever page is asked for by default.

As an overview, to serve the Finnish version, I will most often just take the English version, and swap out each instance of English content and replace it with a Finnish language translation.  On occasion it might be necessary to present an entirely uniquely formatted page for the Finnish version.  And, of course, if there is no Finnish language version, I'll fall back to English.

So, in some detail, every html request to rendall.tv will be received by a controller that first checks to see if the request URI has a 'fi' at the beginning.  If not, it'll just pass through the request.  If there is a fi in the URI, the controller will first check to see if there is a Finnish-language version of that page: the same filename, but with .fi appended to the filename (e.g. fi.rendall.tv/index.html request will look for index.fi.html), and if so, will serve that directly (as index.html). That covers the comparatively rare case of complete Finnish re-formats.

If there is not a full Finnish version of that page, the controller will then look for a similarly named static file (e.g. index.fi.txt) that will map a bunch of key-names to translation values.

eg:  "howdy" : "Ohoi!"


This static file will be used to alter the requested file and present a Finnish version.  HTML tags with the attribute data-translation="<key-name>" get their inner HTML replaced by  <translation-value> in accordance with the mapping.

For instance,
<span class="display" data-translation="howdy">Howdy!</span>

magically transforms into
<span class="display" data-translation="howdy">Ohoi!</span>

It's important that there be an option to alter the actual inner HTML, not only the text.

The sentence "Welcome to the personal and professional website of Rendall Koski" must have for stylistic reasons a nbsp; between my first and last name.  The font I'm using for the welcome banner, Reenie-Beanie, does not have a hard space.  That is to say, nbsp; renders with no space at all. Reenie-Beanie renders Rendallnbsp;Koski as RendallKoski, not Rendall Koski.  The fix is to wrap the nbsp; in a <span> with a .nbsp-fix class that has a different font-family (that does render nbsp;) like so:

Rendall<span class="nbsp-fix">nbps;</span>Koski

Now my name renders properly, and won't break apart.  Problem solved. For English.

But!  The translation for "Welcome to the personal and professional website of Rendall Koski" in Finnish is "Tervetuloa Rendall Koskin persoonallinen ja ammatillinen verkkosivuun"*. If my scheme only allows for straight text translations and no HTML, swapping this in would blow away my span.nbsp-fix above, and then I'm back to having either no nbsp; or stupid-looking RendallKoskin.

Therefore the translation scheme should also cover HTML, so that this mapping
"welcome": 'Tervetuloa Rendall<span class="nbsp-fix">&nbsp;</span>Koskin persoonallinen ja ammatillinen verkkosivuun!'

tranforms the line: 
Welcome to the personal and professional website of Rendall<span class="nbsp-fix">&nbsp;</span>Koski

properly into this line:
Tervetuloa Rendall<span class="nbsp-fix">&nbsp;</span>Koskin persoonallinen ja ammatillinen verkkosivuun!

* I have not yet verified these translations with a native Finnish speaker.