libintl-perl

 Date  08/22/05 15:22:12 GMT
 From  Bruno Haible
 Subject  Problem with untranslated 8bit msgids
Guido Flohr wrote:
> Untranslated strings are
> passed through unmodified in the original character set from the source
> code, whereas translated strings are converted to the character set of
> the selected locale.

This is correct. It is also documented in the manual:

   "Note that the MSGID argument to `gettext' is not subject to
   character set conversion.  Also, when `gettext' does not find a
   translation for MSGID, it returns MSGID unchanged - independently of
   the current output character set.  It is therefore recommended that all
   MSGIDs be US-ASCII strings."
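
A minimal sketch of what "returns MSGID unchanged" means in practice: when
no translation is found, gettext hands back the very pointer it was given,
so a plain pointer comparison detects the untranslated case. The helper name
is_untranslated is made up for this illustration.

```c
#include <libintl.h>
#include <stdbool.h>

/* Hypothetical helper (not part of libintl): returns true if gettext()
   found no translation for MSGID in the current text domain and therefore
   returned the MSGID pointer itself, unconverted.  */
bool
is_untranslated (const char *msgid)
{
  return gettext (msgid) == msgid;
}
```

In that case the bytes you get back are in the source file's character set,
whatever the locale's character set happens to be.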

> A possible fix depends on our ability to determine the msgid character
> set.  Evaluating po headers (eventually fed with character set
> information from xgettext --from-code) is not an option; the example
> shows that there may be no mo file at all that can be sourced.

Correct. You might get a translation from de.mo but read the metainfo from
de_AT.mo, leading to inconsistencies.

> On the other hand, the above example is perfectly legal usage, and using
> non-English non-ASCII msgids is no longer deprecated.

Who said so? The GNU gettext manual recommends ASCII-only msgids, regardless
of the language.

> 1) Only msgids encoded in UTF-8 are supported.
>
> 2) A new function bind_textdomain_input_codeset is introduced, allowing
> the programmer to specify the character set of the msgids in the
> program.  If the function is not called, no default will be assumed, and
> therefore no output conversion on msgids done.
>
> Option 1 has backwards compatibility issues, I prefer option 2.

You also have the following options, regardless of whether your msgids were
originally in ISO-8859-15 or in UTF-8:

  3a. Change your po/Makefile so that the PO files are converted to UTF-8
      just before being converted to a .mo file. For example, in
      po/Makefile.in.in change

      cd $(srcdir) && rm -f $${lang}.gmo && $(GMSGFMT) -c --statistics -o t-$${lang}.gmo $${lang}.po && mv t-$${lang}.gmo $${lang}.gmo

      to

      cd $(srcdir) && rm -f $${lang}.gmo && msgcat -t UTF-8 $${lang}.po | $(GMSGFMT) -c --statistics -o t-$${lang}.gmo - && mv t-$${lang}.gmo $${lang}.gmo

  3b. Use a wrapper function around gettext that does the conversion.

      If your source character set was UTF-8:

      char *my_gettext (const char *msgid)
      {
        char *translation = gettext (msgid);
        if (translation == msgid)
          translation = iconv_string (translation, "UTF-8",
                                      nl_langinfo (CODESET));
        return translation;
      }

      The iconv_string function is a convenience wrapper around iconv(),
      found in gnulib.

      If your source character set was ISO-8859-15:

      char *my_gettext (const char *msgid)
      {
        char *utf8_msgid = iconv_string (msgid, "ISO-8859-15", "UTF-8");
        char *translation = gettext (utf8_msgid);
        if (translation == utf8_msgid)
          translation = iconv_string (translation, "UTF-8",
                                      nl_langinfo (CODESET));
        return translation;
      }

      All this code ignores memory leak issues; take care of them yourself.
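
One way to deal with the memory question, as a rough sketch only: let the
wrapper always return a freshly allocated string, so the caller can free()
the result whether or not a translation was found. The names convert_string
and my_gettext_dup are made up for this illustration, and the conversion
uses POSIX iconv() directly instead of the iconv_string wrapper mentioned
above. It assumes UTF-8 msgids.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <langinfo.h>
#include <libintl.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string S from character
   set FROM to character set TO.  Returns a malloc'ed string, or NULL if the
   conversion is unsupported, fails, or memory runs out.  */
static char *
convert_string (const char *s, const char *from, const char *to)
{
  assert (s != NULL && from != NULL && to != NULL);

  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return NULL;

  char *in = (char *) s;        /* iconv() wants char **; input is not modified */
  size_t inleft = strlen (s);
  size_t size = inleft + 16;
  size_t used = 0;
  char *out = malloc (size);
  int flushing = 0;             /* 0: converting input, 1: emitting reset sequence */

  while (out != NULL)
    {
      char *outp = out + used;
      size_t outleft = size - used - 1;   /* reserve one byte for the NUL */
      size_t rc = flushing
                  ? iconv (cd, NULL, NULL, &outp, &outleft)
                  : iconv (cd, &in, &inleft, &outp, &outleft);
      used = (size_t) (outp - out);
      if (rc != (size_t) -1)
        {
          if (flushing)
            break;              /* input converted and shift state reset */
          flushing = 1;
          continue;
        }
      if (errno != E2BIG)
        {
          free (out);           /* invalid or unconvertible input */
          out = NULL;
          break;
        }
      size *= 2;                /* output buffer too small: grow and retry */
      char *bigger = realloc (out, size);
      if (bigger == NULL)
        free (out);
      out = bigger;
    }

  iconv_close (cd);
  if (out != NULL)
    out[used] = '\0';
  return out;
}

/* Hypothetical wrapper: like gettext(), but the result is always a
   malloc'ed string in the locale's character set (or NULL on failure).
   The caller owns the result and must free() it.  */
char *
my_gettext_dup (const char *msgid)
{
  const char *translation = gettext (msgid);
  if (translation == msgid)
    return convert_string (msgid, "UTF-8", nl_langinfo (CODESET));
  return strdup (translation);
}
```

The price is one extra copy for every translated string; in exchange the
ownership rule is the same on both paths.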

Bruno
 
 08/21/05 12:49:28 GMT  Guido Flohr
 08/22/05 15:22:12 GMT  +--Bruno Haible
