libintl-perl

 Problem with non-ASCII message id's 
 Date  08/18/05 09:05:35 GMT
 From  Guido Flohr
 Subject  Problem with non-ASCII message id's
Hi,

Jörn Reder wrote:
> [ Cut a lot. ]
>
> Locale::TextDomain misses setting the utf-8 flag on the returned string.
> If the utf-8 flag was set, Perl had converted the output to iso-8859-1 when
> printing to my iso-8859-1 terminal.

It doesn't miss setting the flag, it explicitly unsets it.  See the
documentation for Locale::Messages(3pm), function gettext().

Utf-8 flag advocacy or criticism aside: libintl-perl aims to be compatible
with GNU gettext _and_ it will try to use the XS version of the gettext
family of functions if possible.  GNU gettext provides no function to find
out whether the output string is in UTF-8 or not.  Instead of making a lucky
guess I decided to always explicitly turn the flag off, because it cannot be
decided.

> Locale::TextDomain misses setting the utf-8 flag on the returned
> string. If the utf-8 flag was set, Perl had converted the output to
> iso-8859-1 when printing to my iso-8859-1 terminal.

If your terminal only understands iso-8859-1, then you should choose an
iso-8859-1 locale ...

> Is this behaviour intended, resp it's no bug but a feature I just
> don't understand?  ;)

There is a misunderstanding.  The character set information in a po file is
only used to find out the encoding of the po or mo file itself.  This
encoding is the _source_ encoding for a conversion to the destination
character set, which the user defines via personal locale settings.

If your terminal expects iso-8859-1, then you should choose an iso-8859-1
locale (or set $ENV{OUTPUT_CHARSET} to "iso-8859-1") and everything will work
fine.  If you prefer a utf-8 locale, then your terminal should be able to
cope with it.

The real problem is that libintl-perl will only convert the character set of
translations.  When it returns the original msgids, it will leave them alone.
I would consider this a bug if the behavior of GNU gettext differs, but I
have to test it out first.

What you can do is provide an explicit po file for en_US where all msgids are
identical to the msgstrs.  Not very nice, but I really think that this would
also be necessary for the XS versions.
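
A minimal sketch of such an identity catalog, assuming the source strings are
written in UTF-8.  The project name and the sample message are hypothetical
and only illustrate the shape of the file:

```po
# en_US.po -- hypothetical identity catalog; all names are illustrative.
msgid ""
msgstr ""
"Project-Id-Version: my-app 0.1\n"
"Language: en_US\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

# msgstr repeats msgid byte for byte, so the string is treated as a
# translation and goes through the usual charset conversion.
msgid "Résumé saved."
msgstr "Résumé saved."
```

The charset in the header entry must match the encoding the msgids really
have in the source code.  The catalog would then be compiled with msgfmt and
installed as en_US/LC_MESSAGES/<textdomain>.mo below the directory the text
domain is bound to, like any other translation.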

Regards,
Guido
--
Imperia AG, Development
Leyboldstr. 10 - D-50354 Hürth - http://www.imperia.net/

 Problem with non-ASCII message id's
 08/16/05 09:09:12 GMT  Jörn Reder
 08/16/05 12:58:20 GMT  +--Guido Flohr
 08/16/05 14:01:13 GMT    |--Jörn Reder
 08/16/05 15:22:33 GMT    |  +--Guido Flohr
 08/17/05 11:00:11 GMT    |    |--Jörn Reder
 08/18/05 07:30:54 GMT    |    +--Jörn Reder
 08/18/05 08:20:43 GMT    |      +--Jörn Reder
 08/18/05 09:05:35 GMT    |        +--Guido Flohr
 08/17/05 08:29:41 GMT    +--Bruno Haible
