One way I can think of offhand to do that sort of thing without metadata would be to compare statistically unlikely passages against those in a text database (<i>a la</i> Turnitin), but that couldn't be done entirely client-side, and it would have to be backed by an entity big enough to fight off the inevitable copyright claims.<br>
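<br>
A rough sketch of that idea, assuming a small in-memory dictionary of known texts stands in for the database. The function names and the passage length <code>n</code> are made up for illustration, and "statistically unlikely" is approximated crudely as "appears in only one known book":<br>

```python
import re
from collections import Counter, defaultdict

def shingles(text, n=5):
    """Return the set of n-word passages (shingles) in text, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def build_index(library, n=5):
    """Map each shingle that occurs in exactly one known book to that book's title."""
    owners = defaultdict(set)
    for title, text in library.items():
        for s in shingles(text, n):
            owners[s].add(title)
    # Passages shared by several books are not distinctive, so drop them.
    return {s: next(iter(t)) for s, t in owners.items() if len(t) == 1}

def identify(text, index, n=5):
    """Return (title, hit_count) for the best match, or (None, 0) if nothing matches."""
    hits = Counter(index[s] for s in shingles(text, n) if s in index)
    if not hits:
        return None, 0
    return hits.most_common(1)[0]
```

Given a <code>library</code> dict of title to full text, <code>identify(excerpt, build_index(library))</code> returns the title sharing the most distinctive passages with the excerpt. The index still has to live wherever the full texts do, which is why this can't be purely client-side.<br>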

<br><div class="gmail_quote">On Fri, Jun 24, 2011 at 10:40 AM, Benjamin Flanders <span dir="ltr"><<a href="mailto:flanderb@gmail.com">flanderb@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im">On Fri, Jun 24, 2011 at 10:03 AM, David Pembrook <<a href="mailto:david@pembrook.net">david@pembrook.net</a>> wrote:<br>

> Check out calibre. It's cross-platform and looks interesting. I've just tried<br>

> it and it did a nice job cataloging my collection. It even grabs amazon<br>

> descriptions and book covers. I have not delved into all its features. It<br>

> has a server component but I've only tried it as a desktop application.  See<br>

> <a href="http://calibre-ebook.com/" target="_blank">http://calibre-ebook.com/</a>.<br>

><br>

> Quoting their about page:<br>

><br>

> calibre is a free and open source e-book library management application<br>

> developed by users of e-books for users of e-books. It has a cornucopia of<br>

> features divided into the following main categories:<br>

><br>

> Library Management<br>

> E-book conversion<br>

> Syncing to e-book reader devices<br>

> Downloading news from the web and converting it into e-book form<br>

> Comprehensive e-book viewer<br>

> Content server for online access to your book collection<br>

><br>

> Dave<br>

<br>

</div>I'm actually setting up a Calibre Server and this is what caused my<br>

inquiry.  Calibre can download the book information if<br>

you already have some metadata on the book (author and/or<br>

title), but the matching is really finicky, as would be the case for most easily<br>

implemented text comparisons.  On import, it can also<br>

grab some metadata from file types that embed it, like PDFs,<br>

or from the file name for files that don't.  That is<br>

finicky too: I have to mess with the regex whenever a file name is in a<br>

different order than the previous book I imported.<br>
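<br>
For illustration only, the file-name juggling can be sketched outside Calibre with plain Python <code>re</code>, trying several layouts in turn instead of editing one regex per import. This is not Calibre's own import code; the three layouts, the group names, and <code>guess_metadata</code> are assumptions:<br>

```python
import re
from pathlib import PurePath

# Hypothetical file-name layouts, tried in order; adjust to your own collection.
PATTERNS = [
    re.compile(r"^(?P<title>.+?) by (?P<author>.+)$", re.IGNORECASE),
    re.compile(r"^(?P<title>.+?) \((?P<author>[^()]+)\)$"),
    re.compile(r"^(?P<author>.+?) - (?P<title>.+)$"),
]

def guess_metadata(filename):
    """Return {'title': ..., 'author': ...} from a file name, or None if no layout fits."""
    # Drop the extension and treat underscores as spaces before matching.
    stem = PurePath(filename).stem.replace("_", " ").strip()
    for pattern in PATTERNS:
        match = pattern.match(stem)
        if match:
            return {key: value.strip() for key, value in match.groupdict().items()}
    return None
```

A name with no recognizable separator returns <code>None</code>, and a bare "X - Y" name is still ambiguous, since nothing in the name says which half is the author.<br>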

<br>

<br>

<br>

<br>

<br>

Share and Enjoy<br>

Ben<br>

<div><div></div><div class="h5"><br>

<br>

<br>

<br>

<br>

><br>

> On 6/24/2011 9:33 AM, Benjamin Flanders wrote:<br>

><br>

> On Fri, Jun 24, 2011 at 9:17 AM, John-Thomas Richards <<a href="mailto:jtr@jrichards.org">jtr@jrichards.org</a>><br>

> wrote:<br>

><br>

> On Fri, Jun 24, 2011 at 06:50:32AM -0400, Benjamin Flanders wrote:<br>

><br>

> Not totally Linux related, but I thought one of you might know.  Is<br>

> there a program for ebook identification?  I'm thinking along the<br>

> lines of MusicBrainz's PUID audio signature, but for books.  I would<br>

> think it would be easier for ebooks than music since there is no<br>

> compression and a word is a word, but I am coming up with nothing on<br>

> Google. I keep coming up with e-books about fuzzy logic, ISBNs, tree<br>

> identification, signature analysis, and fingerprinting.<br>

><br>

> Wait.  ebooks aren't compressed?  Isn't plain text about the most<br>

> compressible thing around, and lossless at that?  This surprises me.<br>

><br>

> I guess I should not have used the word "compressed".  I was going for<br>

> the term lossless and had a brain bump.  Sorry.<br>

><br>

> Anyway, I would have thought the application would have been out there<br>

> already.<br>

><br>

><br>

><br>

> --<br>

> john-thomas<br>

> ------<br>

> None are more hopelessly enslaved than those who falsely believe they are<br>

> free.<br>

> Johann Wolfgang von Goethe, novelist and philosopher (1749-1832)<br>

><br>

> --<br>

> This message has been scanned for viruses and<br>

> dangerous content by MailScanner, and is<br>

> believed to be clean.<br>

><br>

> _______________________________________________<br>

> grlug mailing list<br>

> <a href="mailto:grlug@grlug.org">grlug@grlug.org</a><br>

> <a href="http://shinobu.grlug.org/cgi-bin/mailman/listinfo/grlug" target="_blank">http://shinobu.grlug.org/cgi-bin/mailman/listinfo/grlug</a><br>

><br>


<br>


</div></div></blockquote></div><br>
