[Discuss] "M-" notation

Fri Jul 6 14:17:25 EDT 2012

On 07/06/2012 02:03 PM, Derek Martin wrote:
> On Fri, Jul 06, 2012 at 01:50:04PM -0400, Jerry Feldman wrote:
>> On 07/06/2012 01:34 PM, Richard Pieri wrote:
>>> On 7/6/2012 11:27 AM, Jerry Feldman wrote:
>>>> I've found that sed(1) tends to work well for me in my scripts. What I
>>>> do in the scripts is something like:
>>> sed works, too.  I find tr to be easier/quicker to use than sed for
>>> simple transformations.  I use sed for more significant edits.
>>>
>> Someone mentioned unicode. There are a number of unicode to ascii
>> converters.
> Can you elaborate?  I can't see how this would work unless the
> "Unicode" file contained only a subset of Unicode which corresponds to
> the 7-bit ASCII character set... in which case the Unicode version of
> the file will be identical to the ASCII version of the file, possibly
> save for a 3-byte encoding marker (which is optional and largely
> unnecessary) at the beginning of the file.
>
Normally, we think of Unicode as 16-bit (UTF-16), but it can also be
encoded as UTF-7 or UTF-8. A true ASCII file is 7-bit. It has been a
while since I have played with encodings, but you certainly can express
Unicode in ASCII by encoding the exceptions as escape sequences.
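
As a rough illustration of what I mean, here is a minimal Python sketch.
It is not taken from any particular converter: the helper names
to_ascii_escaped and from_ascii_escaped are made up for this example,
and it assumes Python's built-in "unicode_escape" codec as the escaping
scheme. Real converters may use different escapes or transliterate
instead.

```python
import codecs


def to_ascii_escaped(text: str) -> bytes:
    """Return 7-bit ASCII bytes; non-ASCII characters become escapes.

    Hypothetical helper. The "unicode_escape" codec writes characters
    outside ASCII as \\xNN, \\uNNNN, or \\UNNNNNNNN, and also escapes
    backslashes and control characters such as newlines.
    """
    return text.encode("unicode_escape")


def from_ascii_escaped(data: bytes) -> str:
    """Reverse to_ascii_escaped (hypothetical helper)."""
    return data.decode("unicode_escape")


if __name__ == "__main__":
    sample = "na\u00efve \u20ac5"

    # Escaped form contains only bytes below 128.
    escaped = to_ascii_escaped(sample)
    print(escaped)
    print(from_ascii_escaped(escaped) == sample)

    # Text restricted to the ASCII subset is byte-identical in UTF-8.
    print("plain text".encode("utf-8") == "plain text".encode("ascii"))

    # The optional UTF-8 encoding marker (BOM) is three bytes long.
    print(len(codecs.BOM_UTF8))

    # UTF-7 is another way to carry Unicode in 7-bit-safe bytes.
    utf7 = sample.encode("utf-7")
    print(all(b < 128 for b in utf7), utf7.decode("utf-7") == sample)
```

The escaped output is safe to push through tools that only expect
ASCII, at the cost of being harder to read, and the round trip only
works if both ends agree on the escaping scheme.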

In any case, sed or tr work fine when dealing with the normal ASCII text
files we see on Linux.

-- 
Jerry Feldman <gaf at blu.org>
Boston Linux and Unix
PGP key id:3BC1EB90 
PGP Key fingerprint: 49E2 C52A FC5A A31F 8D66  C0AF 7CEA 30FC 3BC1 EB90