pdf/txt conversion
David Kramer
david at thekramers.net
Wed Oct 1 20:43:25 EDT 2003
On Wednesday 01 October 2003 08:29 pm, dan moylan wrote:
> does anyone know of a utility to extract text from a .pdf
> file? this would be much as open-office extracts text
> from a .doc file. and that's a very useful thing
> too.
>
> the text information is there, since the find utility
> in xpdf can find text strings in the document.
> however, i've looked and found no utility such as the
> one i need.
>
> any suggestions would be appreciated.
/usr/bin/pdftotext comes as part of xpdf.
Yes, it's very cool.
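In case it helps, here is a minimal sketch of how it is typically called,
assuming pdftotext is on your PATH. The wrapper name pdf_to_text and the
default output naming are made up for illustration; the only real command
in it is "pdftotext input.pdf output.txt".

```shell
#!/bin/sh
# Hypothetical wrapper (not part of xpdf): convert one PDF to plain text.
# usage: pdf_to_text input.pdf [output.txt]
pdf_to_text() {
    src=$1
    # Default output name: the input name with .pdf swapped for .txt.
    dst=${2:-${src%.pdf}.txt}
    # Bail out early if the tool is not installed.
    if ! command -v pdftotext >/dev/null 2>&1; then
        echo "pdftotext not found; it ships with xpdf" >&2
        return 1
    fi
    pdftotext "$src" "$dst"
}
```

With that defined, "pdf_to_text report.pdf" should leave the text in
report.txt (report.pdf is just a placeholder name).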
When you need a conversion program, a good tactic is to look in all the bin
directories for a file named after the source format, followed by "2" or
"to", followed by the destination format (for example, pdftotext or ps2pdf).
This is what I did:
[david]$ locate pdf | grep bin | grep -E '(text|txt|asc|ascii)' | more
/usr/bin/pdftotext
[david@uni Windweb]$ rpm -qf /usr/bin/pdftotext
xpdf-1.00-7
rpm -qf means "query which package owns this file" (-q is query, -f is file).
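The naming tactic above can also be sketched as a small shell function
that walks your PATH instead of relying on the locate database. This is
only a rough sketch: find_converters is a made-up name, and it assumes
converters really do follow the <source>2<dest> or <source>to<dest>
pattern, which not all of them do.

```shell
#!/bin/sh
# Hypothetical helper: list executables on PATH whose names look like
# <source>2<dest>... or <source>to<dest>..., e.g. pdftotext or ps2pdf.
# usage: find_converters SOURCE [DEST]
find_converters() {
    src=$1
    dst=${2:-}          # empty DEST matches any destination format
    old_ifs=$IFS
    IFS=:               # split PATH on colons
    for dir in $PATH; do
        IFS=$old_ifs
        # Quoted parts are literal; only the trailing * is a glob.
        for f in "$dir/$src"2"$dst"* "$dir/$src"to"$dst"*; do
            # An unmatched glob stays literal and fails the -f test.
            [ -f "$f" ] && [ -x "$f" ] && echo "$f"
        done
    done
    IFS=$old_ifs
}
```

For example, "find_converters pdf" lists every pdf2* and pdfto* program it
can see, and "find_converters pdf text" narrows that to names starting with
pdf2text or pdftotext. Once you have a path, rpm -qf on it tells you which
package it came from.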
-------------------------------------------------------------------
DDDD David Kramer http://thekramers.net
DK KD There is an art, it says, or, rather, a knack to flying. The
DKK D knack lies in learning how to throw yourself at the ground
DK KD and miss. All it requires is simply the ability to throw
DDDD yourself forward with all weight, and the willingness not to
mind that it's going to hurt. That is, it's going to hurt if
you fail to miss the ground.
Douglas Adams, "Hitchhiker's Guide to the Galaxy".