
BLU Discuss list archive



pdf/txt conversion



On Wednesday 01 October 2003 08:29 pm, dan moylan wrote:
> does anyone know of a utility to extract text from a .pdf
> file?  this would be much as open-office extracts text
> from a .doc file.  and that's a very useful thing
> too.
>
> the text information is there, since the find utility
> in xpdf can find text strings in the document.
> however, i've looked and found no utility such as the
> one i need.
>
> any suggestions would be appreciated.

/usr/bin/pdftotext comes as part of xpdf.

Yes, it's very cool.

When you need a conversion program, a good tactic is to look in all the bin
directories for a file named after the source format, followed by "2" or "to",
followed by the destination format.

This is what I did: 

[david]$ locate pdf | grep bin | grep -E '(text|txt|asc|ascii)' | more
/usr/bin/pdftotext

[david@uni Windweb]$ rpm -qf /usr/bin/pdftotext
xpdf-1.00-7

rpm -qf means "query which installed package owns this file".
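
The name-based search described above can also be done without a locate
database by globbing each directory on $PATH. This is only a rough sketch: the
function name find_converter is made up for illustration, and it assumes a
POSIX-style shell such as sh or bash.

```shell
# find_converter SRC DST: list executables on $PATH whose names start with
# SRC2DST or SRCtoDST (hypothetical helper, not part of xpdf or rpm).
find_converter() {
    src=$1
    dst=$2
    # Split $PATH on colons and scan each directory in turn.
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        for f in "$dir/$src"2"$dst"* "$dir/$src"to"$dst"*; do
            # An unmatched glob stays literal, so keep only real executables.
            [ -f "$f" ] && [ -x "$f" ] && printf '%s\n' "$f"
        done
    done
    IFS=$old_ifs
}
```

Running "find_converter pdf text" should list /usr/bin/pdftotext on a system
where xpdf is installed. Since the destination format can be spelled several
ways, it is worth trying txt, asc, and ascii as well.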
-------------------------------------------------------------------
DDDD   David Kramer                   http://thekramers.net
DK KD  There is an art, it says, or, rather, a knack to flying.  The
DKK D  knack lies in learning how to throw yourself at the ground 
DK KD  and miss.  All it requires is simply the ability to throw
DDDD   yourself forward with all weight, and the willingness not to 
       mind that it's going to hurt.  That is, it's going to hurt if
       you fail to miss the ground.
                  Douglas Adams, "Hitchhiker's Guide to the Galaxy".




BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org