On Wednesday 01 October 2003 08:29 pm, dan moylan wrote:
> does anyone know of a utility to extract text from a .pdf
> file? this would be much as open-office extracts text
> from a .doc file. and that's a very useful thing
> too.
>
> the text information is there, since the find utility
> in xpdf can find text strings in the document.
> however, i've looked and found no utility such as the
> one i need.
>
> any suggestions would be appreciated.

/usr/bin/pdftotext comes as part of xpdf. Yes, it's very cool.

When you need a conversion program, a good tactic is to look in all the
bin directories for a file named after the source format, followed by "2"
or "to", followed by the destination format. This is what I did:

[david]$ locate pdf | grep bin | grep -E '(text|txt|asc|ascii)' | more
/usr/bin/pdftotext
[david@uni Windweb]$ rpm -qf /usr/bin/pdftotext
xpdf-1.00-7

rpm -qf means "query the package that provides this file".

-------------------------------------------------------------------
DDDD   David Kramer         http://thekramers.net
DK KD  There is an art, it says, or, rather, a knack to flying. The
DKK D  knack lies in learning how to throw yourself at the ground
DK KD  and miss. All it requires is simply the ability to throw
DDDD   yourself forward with all weight, and the willingness not to
       mind that it's going to hurt. That is, it's going to hurt if
       you fail to miss the ground.
       Douglas Adams, "Hitchhiker's Guide to the Galaxy".
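The naming tactic described above (source format, then "2" or "to", then destination format) can also be sketched as a small shell function. This is only an illustration: the name find_converter and its arguments are made up for this example, and it assumes converters follow that naming convention and live in the directories on $PATH unless other directories are passed in.

```shell
# find_converter SRC DST [DIR...]   (hypothetical helper, not a standard tool)
# Prints executables whose names start with SRC + "2" + DST or SRC + "to" + DST.
# Searches the given directories, or every directory on $PATH if none are given.
find_converter() {
    src=$1
    dst=$2
    shift 2
    # No directories given: split $PATH on ":" and search those instead.
    if [ "$#" -eq 0 ]; then
        old_ifs=$IFS
        IFS=:
        set -- $PATH
        IFS=$old_ifs
    fi
    for dir in "$@"; do
        # An unmatched glob stays literal and simply fails the -x test below.
        for f in "$dir/$src"2"$dst"* "$dir/$src"to"$dst"*; do
            if [ -x "$f" ] && [ ! -d "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
    return 0
}
```

For the question in this thread, one would try something like "find_converter pdf text" and, if that finds nothing, repeat with "txt" or "ascii" as the destination. Whatever path it prints can then be handed to rpm -qf as shown above.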