[TRex documentation]

aMail::L10N

Developer's arena


NAME

TRex::L10N - Localization package.


SYNOPSIS

    use TRex::L10N;
    ## Get a language handle for French
    my $language = TRex::L10N->get_handle('fr');
    ## Print the pre-established text (Finished) in French
    print $language->maketext("Finished");


DESCRIPTION

This module allows the usage of aMail in many different languages.

Basically, each language needs its own module containing a lexicon (a hash that maps predefined English texts to their translations in the desired language).

Each lexicon is implemented as a Perl module (.pm) under the directory L10N (e.g. L10N/es.pm, L10N/fr.pm, etc.). As such, building these modules combines language translation skills with Perl programming.

What do you mean by localization?

aMail 2.1 began a process called localization (L10N for short) in which, basically, a program is adapted to each user's local conditions; the first one tackled was language.

In the original version (maintained as aMail-original) all text shown to the user was written out in English, so translating to another language required a complete rewrite of aMail, and a separate version had to be maintained for each language. This is by no means practical, so some mechanisms are used to avoid it.

And how will you do that?

One solution could be to pass each sentence through a translator and show its output (the English translator is a dummy one that returns the sentence unchanged). The problem with this approach is that running a translator just to translate a sentence of 5 to 10 words makes it a heavy, resource-hungry task.

If we keep in mind that the texts are a bunch of sentences of one to ten or fifteen words each, the translation can be achieved with far fewer resources if the sentences are pre-translated and indexed in some way. In that case you only need to indicate the number of the sentence to print, something in this style:

    @Trans = (
           'Hello, World!',
           'Have a nice day'
    );
    print $Trans[0] . "\n";
    print $Trans[1] . "\n";

will print:

    Hello, World!
    Have a nice day

If @Trans is filled with sentences in another language (e.g. Spanish):

    @Trans = (
           'Hola mundo!',
           'Que lo pases bien'
    );
    print $Trans[0] . "\n";
    print $Trans[1] . "\n";

will print:

    Hola mundo!
    Que lo pases bien

As you can see, the program logic (printing sentences) isn't tied to the user's language. There are additional complications, though (plurals, verb tenses and so on), which are handled more efficiently and flexibly with so-called lexicons.
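
To make the idea of a lexicon concrete before Locale::Maketext enters the picture, here is a minimal sketch in which sentences are looked up by their English text instead of by an array position. The names %lexicon_es and translate() are made up for this illustration and are not part of aMail.

```perl
use strict;
use warnings;

# Hypothetical lexicon: English sentence => Spanish translation.
my %lexicon_es = (
    'Hello, World!'   => 'Hola mundo!',
    'Have a nice day' => 'Que lo pases bien',
);

# Look the sentence up; fall back to the English text when no
# translation has been written yet.
sub translate {
    my ($lexicon, $text) = @_;
    return exists $lexicon->{$text} ? $lexicon->{$text} : $text;
}

print translate(\%lexicon_es, 'Hello, World!'),   "\n";
print translate(\%lexicon_es, 'Have a nice day'), "\n";
print translate(\%lexicon_es, 'Not translated'),  "\n";
```

Keying on the English text keeps the calling code readable and lets an untranslated sentence degrade to English instead of printing nothing.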

And what about the aMail implementation?

In aMail the option chosen was lexicons, implemented through Locale::Maketext (thanks Sean M. Burke). Basically it works with hashes (lexicons) where the hash key is the text in English and the value is the text translated to the target language.

In aMail there is a hub class (TRex::L10N) through which the language objects are created: the program just requests one handle for the desired language.

    #!/usr/bin/perl
    # ****** USE THIS STUFF *******
    use CGI;         my $amail=CGI->new(); # THIS IS THE MAIN CGI OBJECT
    use TRex::Common;
    use TRex::L10N;
    # ****** SET THESE VARIABLES *******
    my ($dir_URL, $db_path, $img_URL, $skin_URL) = Set_location;
    my $language = TRex::L10N->get_handle(Get_language());

After this initial step, each message to be shown must be passed to the maketext method, which returns the translation of the text in the previously selected language (this choice is made in TRex::Common, in the $lang variable).

    print $language->maketext("Add Contacts");

The magic in the shadows is that one class is implemented for each language; it contains the translated sentences and lives under TRex::L10N. As examples, TRex::L10N::en and TRex::L10N::es are shown:

    -----------------------------------------------------
    package TRex::L10N::en;
    use Lingua::EN::Inflect;
    use TRex::L10N;
    @ISA = qw( TRex::L10N );
    %Lexicon = (
    ...
        "Add Contacts"
        => "Add Contacts",
    ...
    )
    -----------------------------------------------------
    package TRex::L10N::es;
    use Lingua::EN::Inflect;
    use TRex::L10N;
    @ISA = qw( TRex::L10N );
    %Lexicon = (
    ...
        "Add Contacts"
        => "Agregar Contactos",
    ...
    )
    -----------------------------------------------------
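
The pieces above can be tried out in a single file. This is only a sketch under stated assumptions: the Demo::L10N classes are hypothetical stand-ins for TRex::L10N and its language classes, defined inline so that nothing has to be installed besides Locale::Maketext itself. The English class uses the _AUTO lexicon entry of Locale::Maketext, which makes every unknown key translate to itself.

```perl
use strict;
use warnings;

# Hypothetical hub class, standing in for TRex::L10N.
package Demo::L10N;
use Locale::Maketext;
our @ISA = ('Locale::Maketext');

# English: _AUTO makes every unknown key translate to itself.
package Demo::L10N::en;
our @ISA = ('Demo::L10N');
our %Lexicon = ( _AUTO => 1 );

# Spanish: English key => Spanish value.
package Demo::L10N::es;
our @ISA = ('Demo::L10N');
our %Lexicon = ( 'Add Contacts' => 'Agregar Contactos' );

package main;

# get_handle() maps the language tag to the class Demo::L10N::<tag>.
my $en = Demo::L10N->get_handle('en') or die "no English handle";
my $es = Demo::L10N->get_handle('es') or die "no Spanish handle";

print $en->maketext('Add Contacts'), "\n";
print $es->maketext('Add Contacts'), "\n";
```

The calling code is identical for both languages; only the handle differs.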

Must I implement multiple lexicons for plurals?

You're not the first one to think of this problem (fortunately for the rest of the world and me), so it has already been solved. Lexicons can store not only plain values but also subroutines that perform the transformations (a.k.a. translations) according to each language's rules. Let's look at the next example:

    $address_mess = $language->maketext("Added [quant,_1,address]" .
                               " to your address book", $add_count);

the corresponding programmed translations (in English and Spanish) are:

    -----------------------------------------------------
    package TRex::L10N::en;
    use Lingua::EN::Inflect;
    use TRex::L10N;
    @ISA = qw( TRex::L10N );
    %Lexicon = (
    ...
        "Added [quant,_1,address] to your address book"
        => "Added [quant,_1,address] to your address book",  
    ...
    )
    -----------------------------------------------------
    package TRex::L10N::es;
    use Lingua::EN::Inflect;
    use TRex::L10N;
    @ISA = qw( TRex::L10N );
    %Lexicon = (
    ...
        "Added [quant,_1,address] to your address book"
        => "Se agregaron [quant,_1,dirección] a la Libreta de Direcciones",
    ...
    )
    -----------------------------------------------------

The $add_count variable contains the number of addresses added, so the quant() method is called with $add_count and 'address' as parameters, and returns the count followed by the singular or plural of 'address', depending on the value of $add_count.

Now, the real magic about plurals happens inside the numerate() method, which can (MUST!!) be overridden in each language implementation to return the plural form of the specified word. For English there is a module, Lingua::EN::Inflect (thanks Damian Conway), that implements many functions for English plurals, word comparison, etc.

As an example, the numerate override in TRex::L10N::en:

    ########## numerate for TRex::L10N::en
    ##--- returns the plural of the specified word (overrides Locale::Maketext::numerate)
    sub numerate {        
        my($handle, $num, @forms) = @_;
        my $s = ($num == 1);
        return '' unless @forms;
        if(@forms == 1) {                          ## only the headword form specified    
            return $s ? $forms[0] : Lingua::EN::Inflect::PL($forms[0]); 
        } else {                                   ## singular and plural were specified
            return $s ? $forms[0] : $forms[1];
        }
    };
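
Both branches of a numerate override like the one above can be exercised in a small stand-alone script. This is a sketch, not the aMail code: the Demo::L10N classes are hypothetical, and naive_plural() is a deliberately crude stand-in for Lingua::EN::Inflect::PL so that the example needs only Locale::Maketext.

```perl
use strict;
use warnings;

# Hypothetical hub class, standing in for TRex::L10N.
package Demo::L10N;
use Locale::Maketext;
our @ISA = ('Locale::Maketext');

package Demo::L10N::en;
our @ISA = ('Demo::L10N');
our %Lexicon = ( _AUTO => 1 );   # keys translate to themselves

# Crude stand-in for Lingua::EN::Inflect::PL (assumption for this demo).
sub naive_plural {
    my ($word) = @_;
    return $word =~ /(?:s|x|z|ch|sh)\z/ ? $word . 'es' : $word . 's';
}

# Called by quant() to choose the word form for the given number.
sub numerate {
    my ($handle, $num, @forms) = @_;
    return '' unless @forms;
    return $forms[0] if $num == 1;                 # singular
    return @forms == 1 ? naive_plural($forms[0])   # only headword given
                       : $forms[1];                # explicit plural given
}

package main;

my $lh = Demo::L10N->get_handle('en') or die "no English handle";
my $one  = $lh->maketext('Added [quant,_1,address] to your address book', 1);
my $many = $lh->maketext('Added [quant,_1,address] to your address book', 2);
my $both = $lh->maketext('Found [quant,_1,person,people]', 3);
print "$_\n" for $one, $many, $both;
```

Passing an explicit plural, as in [quant,_1,person,people], is the escape hatch for words that no pluralizing routine gets right.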

I haven't been so blessed in Spanish (there is no Lingua::ES::Inflect or similar), so I wrote a couple of methods for TRex::L10N::es:

    ########## numerate for TRex::L10N::es
    ##--- returns the plural of the specified word (overrides Locale::Maketext::numerate)
    ##
    ##    I'm trying to work with one of the local schools to generate a little project
    ##    to make the class Lingua::es where all regarded to verb tenses, plurals, etc.
    ##    are coded to get the most of Spanish language and Perl.
    ##    In the mean time, I'll use a lookup table for the words needed. :-((
    ##
    ##                                                    Bit-Man (Sept-2000)
    sub numerate {        
        my($handle, $num, @forms) = @_;
        my $s = ($num == 1);
        return '' unless @forms;
        if(@forms == 1) {                          ## only the headword form specified    
            return $s ? $forms[0] : get_plural($forms[0]); 
        } else {                                   ## singular and plural were specified
            return $s ? $forms[0] : $forms[1];
        }
    };
    sub get_plural($) {
        my %lookup = ( 'dirección'      => 'direcciones',
                       'mensaje'        => 'mensajes',
                       'carpeta'        => 'carpetas',
                      );
        return $lookup{$_[0]};
    }
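
A pure lookup table returns undef for any word that is not listed. One possible refinement, sketched below, keeps the table for irregular cases and falls back to a rough rule of thumb for everything else. The rules are a simplification of Spanish pluralization made up for this example (they ignore accent shifts and many exceptions), and the name %irregular is hypothetical.

```perl
use strict;
use warnings;
use utf8;                              # the source contains accented literals
binmode STDOUT, ':encoding(UTF-8)';

# Words whose plural the simple rules below would get wrong.
my %irregular = (
    'dirección' => 'direcciones',      # the accent is dropped in the plural
);

sub get_plural {
    my ($word) = @_;
    return $irregular{$word} if exists $irregular{$word};
    return $word . 's'                  if $word =~ /[aeiouáéó]\z/;  # carpeta -> carpetas
    return substr($word, 0, -1) . 'ces' if $word =~ /z\z/;           # vez -> veces
    return $word . 'es';                                             # lugar -> lugares
}

print get_plural($_), "\n" for qw(dirección mensaje carpeta lugar vez);
```

The table stays the place for exceptions, so the translator only has to list the words the fallback gets wrong.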

So ... no more hard-wired programming?

Well, that's not completely true. A lexicon value can also be an anonymous subroutine, which Locale::Maketext calls (inside an eval) with the language handle and the maketext parameters; you just need to catch the parameters and work with them as desired. In the next example, the sentence about adding addresses to a group is translated to Spanish:

    "Added new [quant,_1,address] to the group [_2]"
    => sub {
        my ( $handle, @param ) = @_;
        ## quant() returns e.g. "2 direcciones"; strip the leading number
        my $quant = $handle->quant( $param[0], 'dirección');
        $quant  =~ s/^\Q$param[0]\E //;
        ## declared outside the if/else so they are visible at the return
        my ( $verb, $adjective );
        if ( $param[0] == 1 ) {      #### singular form
            $verb = 'agregó';
            $adjective = 'nueva';
        } else {                     #### plural form
            $verb = 'agregaron';
            $adjective = 'nuevas';
        }
        return "Se $verb $param[0] $adjective $quant al grupo $param[1]";
    },

``Added 1 new address to the group'' translates as ``Se agregó 1 nueva dirección al grupo'', whereas ``Added 2 new addresses to the group'' translates as ``Se agregaron 2 nuevas direcciones al grupo''. Note that the verb is ``agregó'' in the singular and ``agregaron'' in the plural, and the adjective is ``nueva'' in the singular and ``nuevas'' in the plural, so the sub must handle the number of the verb and of the adjective as well as that of the noun.
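
A subroutine entry of this kind can be tested on its own with a short script. The sketch below is a simplified variant, not the aMail code: the Demo::L10N classes are hypothetical, the noun forms are written out directly instead of going through quant(), and the group name is assumed to be the second maketext argument (hence [_2] in the key).

```perl
use strict;
use warnings;
use utf8;                              # the source contains accented literals
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical hub class, standing in for TRex::L10N.
package Demo::L10N;
use Locale::Maketext;
our @ISA = ('Locale::Maketext');

package Demo::L10N::es;
our @ISA = ('Demo::L10N');
our %Lexicon = (
    # A code reference as lexicon value: called with the language
    # handle followed by the arguments given to maketext().
    'Added new [quant,_1,address] to the group [_2]' => sub {
        my ($handle, $count, $group) = @_;
        my ($verb, $adjective, $noun) = $count == 1
            ? ('agregó',    'nueva',  'dirección')
            : ('agregaron', 'nuevas', 'direcciones');
        return "Se $verb $count $adjective $noun al grupo $group";
    },
);

package main;

my $lh  = Demo::L10N->get_handle('es') or die "no Spanish handle";
my $key = 'Added new [quant,_1,address] to the group [_2]';
my $singular = $lh->maketext($key, 1, 'Amigos');
my $plural   = $lh->maketext($key, 2, 'Amigos');
print "$singular\n$plural\n";
```

Because the whole sentence is assembled inside the sub, verb, adjective and noun can all agree in number without any support from the calling code.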

New languages, the final frontier

Now you have all the tools to enjoy aMail in your preferred language. It seems like a lot at the beginning, but you don't need to do it all at once: start by translating the simplest sentences (the ones that don't use plurals and so on), then move on to the more challenging ones (just one step at a time!!).

Finally, if you need any help, feel free to contact us through our Tech Support Manager or by subscribing to the aMail-devel list.

