Nicholas Bodley wrote:
> Imho, read and heed! I didn't know that. I'm extremely unlikely to
> send e-mail in Japanese, but it's one of those essentials (like
> knowledge of BCC) one really has to keep in mind when sending e-mail.
>
> As I understand it, (and I might well be wrong! Corrections welcome!)
> there are at least two basically-different ways to encode Japanese
> text; iirc, one (Shift-JIS? Apologies if I'm wrong) is something like
> the old {ltrs}/{figs} shift in 5-bit teleprinters -- one can be in the
> wrong mode. The consequence is that if a "mode-change" character is
> omitted, or wrongly sent when it should not be, (or munged...), all
> subsequent text (at least up to a redefining of "mode") is scrambled
> badly. If you think seeing English text in {figs} shift is bad, when
> you have a practical set of something like 2,300 or so
> basically-Chinese characters, and are receiving nonsense, as I
> understand it, that's mojibake.

There are actually several different encodings. Shift_JIS (Microsoft SJIS) is primarily used for web pages and other documents on Windows systems. ISO-2022-JP is used for e-mail. There's also EUC-JP (Extended Unix Code), which is used on Unix systems. Universal encodings like UTF-8 and UTF-16 are also used. I am a big fan of UTF-8 because it supports multiple languages (East Asian, Arabic, Hebrew, Thai, English, etc.) and efficiently handles ASCII (as single bytes).

A great resource on this subject is the book "CJKV Information Processing" by Ken Lunde.

> [Katakana]
>
> One can read more Japanese than one might, at first, expect. Japan has
> imported English words "wholesale", sometimes adapting them to their
> own language (I'm typing on a Compaq "pasokon" -- pasonaru
> konpyuutaa). Perhaps 35,000 words have been imported. These words are
> rendered/written with a simple syllabary called katakana, which
> (except for arbitrary-seeming, never-complicated character shapes) is
> about as easy to learn* as an alphabet, and can be a *lot* of fun.

Both Katakana (foreign words) and Hiragana (native Japanese words) are phonetic, so they are easy to learn. Kanji is also interesting, but to be literate you need to learn a few thousand characters, which is quite a task.
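
For anyone who wants to see the encoding differences concretely, here is a rough Python sketch. The sample word (katakana "pasokon") and all variable names are my own choices for illustration, and it relies only on the codecs that ship with Python's standard library. It encodes the same text in each encoding mentioned in this message, then shows two ways things can go wrong: decoding with the wrong codec, and losing the ISO-2022-JP escape sequences that switch "modes".

```python
import re

# Sample text chosen for illustration only: "pasokon" in katakana.
text = "パソコン"

# Python codec names for the encodings discussed in this message.
encodings = ["shift_jis", "iso2022_jp", "euc_jp", "utf-8", "utf-16"]
encoded = {name: text.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:12} {len(data):2d} bytes  {data!r}")

# UTF-8 leaves plain ASCII untouched, one byte per character.
ascii_bytes = "pasokon".encode("utf-8")

# Wrong codec: UTF-8 bytes read as Shift_JIS come out as mojibake.
garbled = encoded["utf-8"].decode("shift_jis", errors="replace")
print("wrong codec:   ", garbled)

# ISO-2022-JP is stateful: ESC sequences switch between ASCII and JIS.
# Dropping them (as a broken mail gateway might) leaves 7-bit bytes
# that get read in the wrong mode, i.e. as ordinary ASCII characters.
no_escapes = re.sub(rb"\x1b[$(][@-Z]", b"", encoded["iso2022_jp"])
wrong_mode = no_escapes.decode("iso2022_jp")
print("escapes lost:  ", wrong_mode)
```

The last case is the closest match to the {ltrs}/{figs} analogy in the quoted text: the payload bytes are intact, but without the mode change they are interpreted as the wrong character set.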
BLU is a member of BostonUserGroups.
We also thank MIT for the use of their facilities.