BLU Discuss list archive

Fw: What laser printers do you like - Ricoh & Linux

Subject: Fw: What laser printers do you like - Ricoh & Linux
From: david at thekramers.net (David Kramer)
Date: Sat, 15 Jul 2006 10:42:23 -0400
In-reply-to: <E1G1lA6-0006kg-00@vanzandt.comcast.net>
References: <001e01c6a6cc$ac3fcf40$0600a8c0@SAVIN.RFG.COM> <44B7AFE7.5050206@zuken.com> <E1G1lA6-0006kg-00@vanzandt.comcast.net>

James R. Van Zandt wrote:
> I have put together a sizable collection of IEEE papers, but they're
> image-only PDFs, making them hard to search.
> 
> Is there a convenient way to add the metadata to the PDF files
> themselves, along with (say) a hand-typed abstract and OCR of the
> rest, so the whole thing can be indexed by something like beagle
> <http://beaglewiki.org/Main_Page>?  
> 
>               - Jim Van Zandt

I would start by running pdftotext on them, then using regular
expressions to pull metadata out of the text versions.
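
A rough sketch of that approach in Python, in case it helps. One caveat:
pdftotext only extracts a text layer that is already in the PDF, so
image-only scans would need an OCR pass first. The regular expressions
and the "first non-blank line is the title" rule below are assumptions
about how the papers are laid out, not anything IEEE guarantees, and the
function names are just illustrative.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Run pdftotext on one file; "-" sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical patterns; expect to tune them against real papers.
DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')
ABSTRACT_RE = re.compile(r"Abstract[\s:\u2014-]*(.+?)(?:\n\s*\n|\Z)", re.DOTALL)


def extract_metadata(text):
    """Pull a few metadata fields out of the plain-text version of a paper."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    doi = DOI_RE.search(text)
    abstract = ABSTRACT_RE.search(text)
    return {
        # Assumption: the first non-blank line is the title.
        "title": lines[0] if lines else None,
        "doi": doi.group(0) if doi else None,
        # Collapse line breaks inside the abstract into single spaces.
        "abstract": " ".join(abstract.group(1).split()) if abstract else None,
    }
```

Something like extract_metadata(pdf_to_text("paper.pdf")) would then give
a small dict per paper that can be written out for an indexer to pick up.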

Oddly enough, this is the basis of one of the projects I'm working on at
Aptima: pulling metadata from information coming from many sources in
many formats, tracking the metadata, and grouping documents by that
metadata.

Follow-Ups:
- Fw: What laser printers do you like - Ricoh & Linux
  - From: john.abreau at zuken.com (John Abreau)

References:
- Fw: What laser printers do you like - Ricoh & Linux
  - From: vince.mchugh at yahoo.com (Yahoo Mail)
- Fw: What laser printers do you like - Ricoh & Linux
  - From: john.abreau at zuken.com (John Abreau)
- Fw: What laser printers do you like - Ricoh & Linux
  - From: jrvz at comcast.net (James R. Van Zandt)


Boston Linux & Unix / webmaster@blu.org