BLU Discuss list archive

i18n

Subject: i18n
From: sethg at ropine.com (Seth Gordon)
Date: Fri, 17 Mar 2006 10:38:17 -0500
In-reply-to: <1142609066.3278.315.camel@ernie>
References: <200603161303.04338.gaf@blu.org> <441A50A5.2040501@comcast.net> <op.s6jrvvloymp22i@dsl092-074-189.bos1.dsl.speakeasy.net> <200603170855.50148.gaf@blu.org> <1142609066.3278.315.camel@ernie>

Ed Hill wrote:
>>
>>The problem with Unix/Linux is that it is still based on 8-bit characters, 
>>and an internationalized program must be set up to use either 16-bit or 
>>wider. Java was written so that its native character type is 16 bits, which 
>>is sufficient for a majority of languages, but not for Asian languages.
> 
> The above, as written, is simply not true.  UTF-8 is a perfectly valid
> Unicode encoding and, for the characters that match the ASCII 0x00 to
> 0x7F, it uses the *identical* 8bits/character encoding and is therefore
> largely (read: as much as possible) backwards-compatible with older
> programs, text files, etc.

The standard Unix string-handling libraries don't know from UTF-8, so, 
for example, they will assume that every character is one byte wide.
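To make the one-byte-per-character assumption concrete, here is a small Python sketch (Python is my choice for illustration, and the sample word "café" and the helper names are hypothetical). Counting bytes, as a byte-oriented routine such as C's strlen() does, gives a different answer from counting characters as soon as the text leaves ASCII:

```python
def byte_length(text: str) -> int:
    """Length as a byte-oriented routine would see it: UTF-8 bytes."""
    return len(text.encode("utf-8"))


def char_length(text: str) -> int:
    """Length in Unicode code points, which is what a reader would count."""
    return len(text)


if __name__ == "__main__":
    for sample in ("cafe", "café"):
        print(sample, byte_length(sample), char_length(sample))
```

For pure ASCII the two counts agree, which is why older programs mostly keep working on ASCII-only data.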

You could encode "avión.txt" in UTF-8 and use it as a file name, and a 
terminal window configured to use UTF-8 would be able to display that 
name.  But in order for "ls avi?n.txt" to work, the shell's globbing 
algorithm would have to recognize that "\xc3\xb3" is the single UTF-8 
character "ó" (and not, say, the two ISO-8859-1 characters "Ã³").
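A rough Python sketch of the same point follows. It uses Python's fnmatch module as a stand-in for the shell's globbing, which is an assumption made for illustration and not how any particular shell is implemented. The bytes "\xc3\xb3" are decoded both ways, and the "?" wildcard is then tried against the file name once as raw bytes and once as decoded text:

```python
import fnmatch

# The two bytes in question, interpreted under each encoding.
raw = b"\xc3\xb3"
as_utf8 = raw.decode("utf-8")         # one character
as_latin1 = raw.decode("iso-8859-1")  # two characters

# The file name as stored on disk (UTF-8 bytes) and as decoded text.
name_bytes = b"avi\xc3\xb3n.txt"
name_text = name_bytes.decode("utf-8")

# Byte-wise matching: "?" stands for exactly one byte.
byte_match = fnmatch.fnmatchcase(name_bytes, b"avi?n.txt")

# Character-wise matching: "?" stands for exactly one character.
char_match = fnmatch.fnmatchcase(name_text, "avi?n.txt")

if __name__ == "__main__":
    print(len(as_utf8), len(as_latin1))
    print(byte_match, char_match)
```

The byte-wise match fails because "?" can only absorb one of the two bytes, while the character-wise match succeeds. A globbing routine has to decode the name using the right encoding before "?" behaves the way the user expects.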

References:
- Japanese characters on OO.o presentation
  - From: gaf at blu.org (Jerry Feldman)
- Japanese characters on OO.o presentation--> i18n
  - From: robertlaferla at comcast.net (Robert La Ferla)
- i18n
  - From: nbodley at speakeasy.net (Nicholas Bodley)
- i18n
  - From: gaf at blu.org (Jerry Feldman)
- i18n
  - From: ed at eh3.com (Ed Hill)

Boston Linux & Unix / webmaster@blu.org