On 07/06/2012 02:03 PM, Derek Martin wrote:
> On Fri, Jul 06, 2012 at 01:50:04PM -0400, Jerry Feldman wrote:
>> On 07/06/2012 01:34 PM, Richard Pieri wrote:
>>> On 7/6/2012 11:27 AM, Jerry Feldman wrote:
>>>> I've found that sed(1) tends to work well for me in my scripts. What I
>>>> do in the scripts is something like:
>>> sed works, too. I find tr to be easier/quicker to use than sed for
>>> simple transformations. I use sed for more significant edits.
>>>
>> Someone mentioned unicode. There are a number of unicode to ascii
>> converters.
> Can you elaborate? I can't see how this would work unless the
> "Unicode" file contained only a subset of Unicode which corresponds to
> the 7-bit ASCII character set... in which case the Unicode version of
> the file will be identical to the ASCII version of the file, possibly
> save for a 3-byte encoding marker (which is optional and largely
> unnecessary) at the beginning of the file.
>

Normally, we think of Unicode as 16-bit (UTF-16). It can also be UTF-7
or UTF-8. A true ASCII file is 7-bit. It has been a while since I have
played with encodings, but you certainly can express Unicode in ASCII
by encoding the exceptions as escape sequences. In any case, sed or tr
work fine when dealing with the normal ASCII text files we see on Linux.

--
Jerry Feldman <gaf at blu.org>
Boston Linux and Unix
PGP key id:3BC1EB90
PGP Key fingerprint: 49E2 C52A FC5A A31F 8D66 C0AF 7CEA 30FC 3BC1 EB90
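As a rough illustration of the two points made in the thread, here is a small Python sketch. It is not taken from any of the posters' scripts. It checks that pure-ASCII text has the same bytes in UTF-8 apart from the optional 3-byte marker, and that non-ASCII characters can be carried in a 7-bit file as escape sequences. The helper names (`to_ascii_escaped`, `from_ascii_escaped`, `is_seven_bit`) are made up for this example, and Python's `unicode_escape` codec is just one possible escaping scheme; UTF-7 is shown as another.

```python
import codecs


def to_ascii_escaped(text: str) -> bytes:
    """Encode text as 7-bit bytes, writing non-ASCII characters as escapes.

    Uses Python's built-in 'unicode_escape' codec (an assumption for this
    sketch; other converters use other escape schemes). Note that it also
    escapes backslashes and control characters such as newlines.
    """
    return text.encode("unicode_escape")


def from_ascii_escaped(data: bytes) -> str:
    """Reverse to_ascii_escaped()."""
    return data.decode("unicode_escape")


def is_seven_bit(data: bytes) -> bool:
    """True if every byte fits in 7 bits, i.e. the data is plain ASCII."""
    return all(byte < 0x80 for byte in data)


if __name__ == "__main__":
    plain = "plain text"
    # Pure-ASCII content: the UTF-8 bytes are the same as the ASCII bytes.
    assert plain.encode("utf-8") == plain.encode("ascii")
    # 'utf-8-sig' differs only by the 3-byte marker at the start.
    assert plain.encode("utf-8-sig") == codecs.BOM_UTF8 + plain.encode("ascii")

    mixed = "caf\u00e9 \u2013 na\u00efve"
    escaped = to_ascii_escaped(mixed)
    assert is_seven_bit(escaped)
    assert from_ascii_escaped(escaped) == mixed

    # UTF-7 is another way to keep everything within 7 bits.
    utf7 = mixed.encode("utf-7")
    assert is_seven_bit(utf7)
    assert utf7.decode("utf-7") == mixed

    print(escaped)
    print(utf7)
```

Once the text is in an escaped 7-bit form like this, byte-oriented tools such as tr or sed can process it without touching multi-byte sequences, though any pattern then has to match the escape sequence rather than the original character.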
BLU is a member of BostonUserGroups.
We also thank MIT for the use of their facilities.