2 thoughts on “if I had to find duplicates in book catalogs with entries with different titles but with the same content..”

  1. What we can do is store more attributes with every book, such as the number of chapters and the ordered list of chapter titles, and check those for similarity. At a high level, I think this is feasible.

  2. What is your similarity function? Does that mean you need to compare every book against every other book, i.e. an O(n^2) algorithm?

    For an application like Google Print, there are millions of books. If you run an O(n^2) algorithm over all of them, it could take years to find the similar books, couldn't it?

    Pa1
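
To make the two comments above concrete, here is a minimal Python sketch, not anyone's actual implementation. It assumes each book comes with its list of chapter titles, compares those lists in order, and groups books by chapter count first so that only books in the same group are compared, rather than every pair in the catalog. The function names, the `books` layout, and the 0.8 threshold are all hypothetical choices for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())


def chapter_similarity(chapters_a, chapters_b):
    """Order-aware similarity in [0, 1] between two lists of chapter titles."""
    a = [normalize(t) for t in chapters_a]
    b = [normalize(t) for t in chapters_b]
    if not a or not b:
        # Assumption: a book with no chapter data is never flagged as a duplicate.
        return 0.0
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def find_candidate_duplicates(books, threshold=0.8):
    """Return pairs of book ids whose chapter lists look alike.

    books: dict mapping a book id to its list of chapter titles
    (a hypothetical layout chosen for this sketch).
    """
    # Blocking step: group books by chapter count.
    buckets = defaultdict(list)
    for book_id, chapters in books.items():
        buckets[len(chapters)].append(book_id)

    # Pairwise comparison happens only inside each bucket.
    pairs = []
    for ids in buckets.values():
        for x, y in combinations(sorted(ids), 2):
            if chapter_similarity(books[x], books[y]) >= threshold:
                pairs.append((x, y))
    return sorted(pairs)


if __name__ == "__main__":
    catalog = {
        "a": ["Intro", "Methods", "Results"],
        "b": ["intro", "methods ", "results"],
        "c": ["Preface", "Part One", "Part Two"],
    }
    print(find_candidate_duplicates(catalog))
```

Grouping by exact chapter count is a crude blocking key: it would miss duplicates where one edition adds or drops a chapter, and a very large bucket is still compared pairwise, so the worst case stays quadratic. It only shows that the all-pairs comparison can often be avoided.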
