[GRLUG] ebook identification

Michael Passer michael.passer at gmail.com
Fri Jun 24 10:45:23 EDT 2011


One way I can think of offhand to do that sort of thing absent metadata
would be comparing statistically unlikely passages with those in a text
database (*a la* Turnitin) -- but there would be no way to do that entirely
client-side, and it would have to be backed by an entity big enough to fight
off the inevitable copyright claims.
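
To make the passage-comparison idea concrete, here is a rough sketch in
Python. It is only an illustration under stated assumptions: the names
(shingles, identify, KNOWN_TEXTS, DATABASE) are made up for this example,
the two-entry "database" is a toy, and plain overlap of hashed five-word
runs stands in for a real "statistically unlikely passage" measure, which
would also weight how rare each passage is across the whole corpus.

```python
import hashlib
import re


def shingles(text, k=5):
    """Return the set of hashed k-word runs ("shingles") found in text."""
    # Lowercase and drop punctuation so formatting differences don't matter.
    words = re.findall(r"[a-z0-9']+", text.lower())
    grams = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    # Store short hashes rather than the passages themselves.
    return {hashlib.sha1(g.encode("utf-8")).hexdigest()[:16] for g in grams}


def identify(sample, database, k=5, threshold=0.5):
    """Return (title, score) for the best match, or (None, score) if too weak.

    score is the fraction of the sample's shingles found in the candidate.
    """
    sample_set = shingles(sample, k)
    if not sample_set:
        return None, 0.0
    best_title, best_score = None, 0.0
    for title, fingerprint in database.items():
        score = len(sample_set & fingerprint) / len(sample_set)
        if score > best_score:
            best_title, best_score = title, score
    if best_score < threshold:
        return None, best_score
    return best_title, best_score


# Hypothetical stand-in for the server-side text database.
KNOWN_TEXTS = {
    "Moby-Dick": (
        "Call me Ishmael. Some years ago, never mind how long precisely, "
        "having little or no money in my purse"
    ),
    "Pride and Prejudice": (
        "It is a truth universally acknowledged, that a single man in "
        "possession of a good fortune, must be in want of a wife"
    ),
}
DATABASE = {title: shingles(text) for title, text in KNOWN_TEXTS.items()}
```

With that toy database, identify("Call me Ishmael. Some years ago, never
mind how long precisely", DATABASE) comes back as ("Moby-Dick", 1.0), and
unrelated text falls below the threshold and comes back as (None, 0.0).
Only hashes are compared here, but building the database in the first
place still requires the full texts, which is the part that could not
live on the client.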

On Fri, Jun 24, 2011 at 10:40 AM, Benjamin Flanders <flanderb at gmail.com> wrote:

> On Fri, Jun 24, 2011 at 10:03 AM, David Pembrook <david at pembrook.net>
> wrote:
> > Check out calibre. It's cross-platform and looks interesting. I've just
> > tried it and it did a nice job cataloging my collection. It even grabs
> > Amazon descriptions and book covers. I have not delved into all its
> > features. It has a server component, but I've only tried it as a desktop
> > application. See http://calibre-ebook.com/.
> >
> > Quoting their about page:
> >
> > calibre is a free and open source e-book library management application
> > developed by users of e-books for users of e-books. It has a cornucopia
> > of features divided into the following main categories:
> >
> > Library Management
> > E-book conversion
> > Syncing to e-book reader devices
> > Downloading news from the web and converting it into e-book form
> > Comprehensive e-book viewer
> > Content server for online access to your book collection
> >
> > Dave
>
> I'm actually setting up a Calibre server, and this is what prompted my
> inquiry. Calibre can download book information if you already have
> some metadata for the book (author and/or title), but the matching is
> really finicky, as would be the case for most easily implemented text
> comparisons. On import it can also grab metadata from file types that
> carry it, like PDFs, or from the file name for those that don't. The
> file-name route is finicky too: I have to mess with the regex whenever
> a file name is in a different order than the previous book I imported.
>
> Share and Enjoy
> Ben
>
> >
> > On 6/24/2011 9:33 AM, Benjamin Flanders wrote:
> >
> > On Fri, Jun 24, 2011 at 9:17 AM, John-Thomas Richards <jtr at jrichards.org>
> > wrote:
> >
> > On Fri, Jun 24, 2011 at 06:50:32AM -0400, Benjamin Flanders wrote:
> >
> > Not totally Linux-related, but I thought one of you might know.  Is
> > there a program for ebook identification?  I'm thinking along the
> > lines of MusicBrainz PUID audio signatures, but for books.  I would
> > think it would be easier for ebooks than music since there is no
> > compression and a word is a word, but I am coming up with nothing on
> > Google. I keep getting e-books about fuzzy logic, ISBNs, tree
> > identification, signature analysis, and fingerprinting.
> >
> > Wait.  ebooks aren't compressed?  Isn't plain text about the most
> > compressible thing around, and lossless at that?  This surprises me.
> >
> > I guess I should have not used the word "compressed".  I was going for
> > the term lossless and had a brain bump.  Sorry.
> >
> > Anyway, I would have thought the application would have been out there
> > already.
> >
> >
> >
> > --
> > john-thomas
> > ------
> > None are more hopelessly enslaved than those who falsely believe they are
> > free.
> > Johann Wolfgang von Goethe, novelist and philosopher (1749-1832)
> >
> > --
> > This message has been scanned for viruses and
> > dangerous content by MailScanner, and is
> > believed to be clean.
> >
> > _______________________________________________
> > grlug mailing list
> > grlug at grlug.org
> > http://shinobu.grlug.org/cgi-bin/mailman/listinfo/grlug
> >
>
>
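
On the file-name regex problem Ben describes in the quoted text: one way
to avoid re-editing a single regex per book is to try several layouts in
order and take the first that matches. The sketch below is independent of
Calibre and does not use its import settings; the three layouts, the
PATTERNS list and guess_metadata are assumptions made up for
illustration, and a real collection would need its own patterns.

```python
import re
from pathlib import Path

# Hypothetical file-name layouts, most specific first. Named groups mean the
# author/title order in the file name no longer matters to the caller.
PATTERNS = [
    # "Author - Title (1999)"
    re.compile(r"^(?P<author>.+?) - (?P<title>.+?) \((?P<year>\d{4})\)$"),
    # "Title by Author"
    re.compile(r"^(?P<title>.+?) by (?P<author>.+)$"),
    # "Author - Title"
    re.compile(r"^(?P<author>.+?) - (?P<title>.+)$"),
]


def guess_metadata(filename):
    """Guess metadata from a file name; fall back to using the stem as title."""
    # Drop the extension and treat underscores as spaces.
    stem = Path(filename).stem.replace("_", " ").strip()
    for pattern in PATTERNS:
        match = pattern.match(stem)
        if match:
            return {key: value.strip() for key, value in match.groupdict().items()}
    return {"title": stem}
```

For example, guess_metadata("Pride_and_Prejudice_by_Jane_Austen.pdf")
gives {"title": "Pride and Prejudice", "author": "Jane Austen"}. It is
still guesswork: a title that itself contains " by " or " - " will be
split in the wrong place, which is the same finickiness in another form.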

