Neural network uses deep learning to categorize books by genre.
The idiom “never judge a book by its cover” warns against evaluating something purely by the way it looks. And yet book covers are designed to give readers an idea of the content, to make them want to pick up a book and read it. Good book covers are designed to be judged.
And humans are quite good at it. It’s relatively straightforward to pick out a cookery book or a biography or a travel guide just by looking at the cover.
And that raises an interesting question: can machines judge books by their covers, too?
Today we get an answer thanks to the work of Brian Kenji Iwana and Seiichi Uchida at Kyushu University in Japan. These guys have trained a deep neural network to study book covers and determine the category of book they come from. Their published research is available here.
Their method is straightforward. Iwana and Uchida downloaded 137,788 unique book covers from Amazon.com, along with each book's genre. There are 20 possible genres, but where a book was listed in more than one category, the researchers used just the first.
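To make the labelling rule concrete, here is a minimal sketch of "use just the first category." The record layout, the titles, and the helper name `primary_genre` are hypothetical and chosen only for illustration; they are not taken from the researchers' code or from Amazon's data format.

```python
def primary_genre(categories):
    """Return the first listed category, or None if no category is listed."""
    return categories[0] if categories else None


# Hypothetical records: each book lists one or more categories, in order.
books = [
    {"title": "Example memoir", "categories": ["Biographies & Memoirs", "History"]},
    {"title": "Example cookbook", "categories": ["Cookbooks"]},
]

# Keep one label per book: the first category only.
labelled = [(book["title"], primary_genre(book["categories"])) for book in books]
print(labelled)
```

One consequence of this rule is that a secondary category, such as "History" for the memoir above, never appears as a training label.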
Next, the pair used 80 percent of the dataset to train a neural network to recognize the genre by looking at the cover image. Their neural network has four layers, each with up to 512 neurons, which together learn to recognize the correlation between cover design and genre. The pair used a further 10 percent of the dataset to validate the model and then tested the neural network on the final 10 percent to see how well it categorizes covers it has never seen.
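The 80/10/10 division described above can be sketched in a few lines. This is an illustration only: it assumes a random shuffle with a fixed seed, which the article does not specify, and the function name `split_dataset` is made up for the example. Integers stand in for the cover images.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items and split them into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # assumption: random split, fixed seed
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains, roughly 10 percent
    return train, val, test


# Stand-in for the 137,788 covers: one integer ID per cover.
covers = range(137_788)
train, val, test = split_dataset(covers)
print(len(train), len(val), len(test))
```

The point of keeping the three sets disjoint is that the test covers play no part in training or tuning, so the reported accuracy reflects covers the network has never seen.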
The results make for interesting reading. The algorithm listed the correct genre in its top 3 choices over 40 percent of the time and found the exact genre more than 20 percent of the time. That’s significantly better than chance. “This shows that classification of book cover designs is possible, although a very difficult task,” say Iwana and Uchida.
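For a sense of what "better than chance" means here, the sketch below computes the chance baselines for 20 genres and shows how top-1 and top-3 accuracy are scored. The baselines assume every genre is equally likely, which the article does not state, and the scores and labels are toy values with only four genres, not the researchers' results.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest-scoring genres."""
    hits = 0
    for row, label in zip(scores, labels):
        # Genre indices ordered from highest score to lowest.
        ranked = sorted(range(len(row)), key=lambda g: row[g], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Chance baselines for 20 genres, assuming all genres are equally likely.
NUM_GENRES = 20
chance_top1 = 1 / NUM_GENRES  # one guess out of 20
chance_top3 = 3 / NUM_GENRES  # three distinct guesses out of 20

# Toy example: one row of per-genre scores per cover, plus the true genre index.
scores = [
    [0.1, 0.7, 0.2, 0.0],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
]
labels = [1, 2, 0]

print(chance_top1, chance_top3)
print(top_k_accuracy(scores, labels, 1), top_k_accuracy(scores, labels, 3))
```

Under that equal-likelihood assumption, random guessing would be right about 5 percent of the time for the exact genre and about 15 percent of the time for a top-3 hit, against the reported figures of more than 20 percent and over 40 percent.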
Some categories turn out to be easier to recognize than others. For example, travel books and books about computers and technology are relatively easy for the neural network to spot because book designers consistently use similar images and designs for these genres.
The neural net also found that cookbooks were easy to recognize if they used pictures of food but were entirely ambiguous if they used a different design such as a picture of the chef.
Biographies and memoirs were also problematic, with the algorithm often selecting history as the category. Interestingly, for many of these books, history is the secondary genre listed on Amazon, suggesting that the algorithm wasn’t entirely bamboozled.
The algorithm also confused children’s books with comics and graphic novels, and medical books with science books. Perhaps that’s also understandable given the similarities between these categories.
There is one shortcoming in this work. Iwana and Uchida have not compared the performance of their neural network against humans’ ability to recognize book genres by their covers. That would be an interesting experiment and one that would be relatively straightforward to do with an online crowdsourcing service such as Amazon’s Mechanical Turk.
Until that work is done, there is no way of knowing whether machines are any better at this task than humans. Still, no matter how good humans are at this task, it is surely only a matter of time before machines outperform them.
Nevertheless, this is interesting work that could help designers improve their skills when it comes to book covers. A more likely outcome, however, is that it could be used to train machines to design book covers without the need for human input. And that means book cover design is just another job that is set to be consigned to the history books.