You know who's totally psyched about this email? Susan Cutright and Rebecca Sniderman...

On Sun, Sep 19, 2010 at 8:01 PM, <jc-8FIgwK2HfyJMuWfdjsoA/w at public.gmane.org> wrote:
> Dan Ritter wrote:
> | antiword is the usual candidate. Every one of Google's first ten
> | results for that are relevant.
>
> Yeah, I thought of that, too, but I was hoping there might be something that
> does a better job. In one of my current sample .doc files, for example,
> antiword produces the curious table entry:
>
> | CUTRIGHT, Susan        |11 Arlington Road    | (781)209-9877          |
> |                        |Waltham, MA  02453   |susan.cutright at ASPENTECH|
> |                        |                     |.com                    |
>
> Note the "wrapping" of the email address, with the ".com" on a separate line.
> When Word displays this on a Windows screen, this wrapping doesn't happen.
> The 3rd column strings are actually centered, and the email address is
> whole.
>
> After a bit of exploring, I found that the -w option works to get a wider
> "page" size, and this entry actually works, but others in the file don't.
> When I tried things like "antiword -w 200 <file>", it decreases the width
> to 138, which seems to be the widest "page" that it believes possible. So
> later in the same file, I get the following 138-char-wide chunk:
>
> |SNIDERMAN, Rebecca                            |MB 1794 Brandeis University P O Box      |            rsnider-1FONPbNgvBv2fBVCVOL8/A at public.gmane.org              |
> |                                              |549110                                   |                                               |
> |                                              |Brandeis University                      |                                               |
> |                                              |Waltham, MA  02454-9110                  |                                               |
>
> Note the bizarre 4-line address, with just "549110" on the second line. Of
> course, the sensible thing would be to remove the first "Brandeis University"
> from the address, but that's what's in the file, and there are other entries
> with quite long addresses. I tried to write a perl parser that would handle
> all the entries in this file and a couple of others, and after an afternoon
> of hacking at it, I still haven't quite succeeded. Such spurious line
> wrapping, including things like splitting ".net" into ".n" and "et" in one
> case, can be one of the trickier kinds of damage to fix.
>
> I wonder if there's a clean fix to this sort of problem?
>
> (And why a max of 138 chars? That's a rather bizarre number.)
>
>
> --
>   _'
>   O
>  <:#/>  John Chambers
>   +   <jc-8FIgwK2HfyJMuWfdjsoA/w at public.gmane.org>
>  /#\  <jc1742-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org>
>  | |
> _______________________________________________
> Discuss mailing list
> Discuss-mNDKBlG2WHs at public.gmane.org
> http://lists.blu.org/mailman/listinfo/discuss
>
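
For what it's worth, here is one way the "un-wrapping" step might be sketched in Python. This is only an illustration, not a known-good fix. It assumes the antiword output is the pipe-delimited layout quoted above, that a row with an empty first cell continues the row before it, and that a cell made of a single token containing "@" is an email address that was split mid-token. (In the raw antiword output the address has a real "@"; the " at " in the quoted text comes from the list archive.) The function names are made up for this example.

```python
def split_row(line):
    """Split one '|cell|cell|' line into a list of stripped cell strings."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    return [cell.strip() for cell in text.split("|")]


def join_piece(so_far, piece):
    """Append a wrapped fragment to the text collected so far for one cell."""
    if not piece:
        return so_far
    if not so_far:
        return piece
    # Assumed heuristic: a single token containing "@" is an email address
    # that was wrapped mid-token, so glue the fragment on without a space.
    if "@" in so_far and " " not in so_far:
        return so_far + piece
    # Anything else (names, street addresses) is rejoined with one space.
    return so_far + " " + piece


def merge_wrapped_rows(lines):
    """Fold continuation rows (empty first cell) into the preceding record."""
    records = []
    for line in lines:
        if not line.strip().startswith("|"):
            continue  # skip anything that is not a table row
        cells = split_row(line)
        if cells[0] or not records:
            records.append(cells)  # non-empty first cell starts a new record
            continue
        previous = records[-1]
        for i, piece in enumerate(cells):
            if i < len(previous):
                previous[i] = join_piece(previous[i], piece)
            else:
                previous.append(piece)
    return records
```

Feeding it the lines of a wrapped entry gives back one list of cells per record, with the address lines joined by spaces and the email fragments glued together. The "@" test is the fragile part: it will misfire on any cell that mixes an address with other text, so real files would likely need a stricter rule.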
BLU is a member of BostonUserGroups.
We also thank MIT for the use of their facilities.