Content based image classification with the bag of visual words model in Python

Even with the ever-growing interest in deep learning, I still find myself using the bag of visual words approach, if only to have a familiar baseline to test my fancy new algorithms against. I especially like the BoW demo script from the VLFeat team, which reaches a solid 65% accuracy on the, admittedly outdated, Caltech101 dataset. The script has the advantage that it contains all the usual steps in one place (feature extraction, training of the classifier, and evaluation of the whole pipeline) and that it can easily be adapted to other datasets.

The only problem was that it is a Matlab script, and in my experience Matlab licences are often scarce, even at research institutes, due to their high price. So I rewrote the script in Python using the incomplete VLFeat Python wrapper.

You can find my code, as usual, on GitHub: https://github.com/shackenberg/phow_caltech101.py

In case you are just diving into the world of BoW, I recommend my minimal BoW image classifier code, which might be easier to understand.
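The whole pipeline fits in a short, self-contained sketch. To keep it runnable, the local descriptors below are fake 2-D points instead of real SIFT/PHOW features, and a nearest-mean classifier stands in for the SVM, but the codebook-building and quantisation steps are the same idea:

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_descriptors(label, n=50):
    """Toy stand-in for local image descriptors: each "image" yields
    a set of 2-D vectors drawn around a class-specific centre."""
    centre = np.array([0.0, 0.0]) if label == 0 else np.array([5.0, 5.0])
    return centre + rng.normal(size=(n, 2))

def build_codebook(descriptors, k=4, iters=10):
    # Plain k-means: pick k random descriptors as initial visual
    # words, then alternate assignment and mean-update steps.
    words = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(descriptors[:, None] - words[None], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                words[j] = descriptors[assign == j].mean(axis=0)
    return words

def bow_histogram(descriptors, words):
    # Quantise every descriptor to its nearest visual word and count
    # how often each word occurs: the bag-of-words vector.
    dists = np.linalg.norm(descriptors[:, None] - words[None], axis=2)
    hist = np.bincount(dists.argmin(axis=1), minlength=len(words))
    return hist / hist.sum()

labels = [0, 0, 1, 1]
images = [fake_descriptors(l) for l in labels]
codebook = build_codebook(np.vstack(images))
train = np.array([bow_histogram(d, codebook) for d in images])

# Nearest-mean "classifier" on the histograms, standing in for the SVM.
means = np.array([train[:2].mean(axis=0), train[2:].mean(axis=0)])
test_hist = bow_histogram(fake_descriptors(1), codebook)
pred = np.linalg.norm(means - test_hist, axis=1).argmin()
print(pred)
```

On real data you would extract dense SIFT descriptors per image, use a much larger codebook, and train an SVM on the histograms, but that is the whole trick.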

Book: Programming Computer Vision with Python

Book cover: Programming Computer Vision with Python

In case anyone missed it, you can download a very mature draft of “Programming Computer Vision with Python” at programmingcomputervision.com. The book takes a fresh approach to introducing people to the computer vision field. It is aimed at beginners who have some programming experience (not necessarily in Python) and a basic understanding of linear algebra (matrices and vectors) and analysis.

The covered topics are (as taken from the TOC):

  • Basic Image Handling and Processing
  • Local Image Descriptors
  • Image to Image Mappings
  • Camera Models and Augmented Reality
  • Clustering Images
  • Searching Images
  • Classifying Image Content
  • Image Segmentation
  • OpenCV

What I like most are the mini projects, like programming your own little augmented reality app or building a complete web app for content-based image search. It is always great to have little working demos to show to your friends. I will definitely recommend it to anyone new to and interested in the computer vision field.

The author, Jan Erik Solem, is an Associate Professor at Lunds Universitet and co-founder of Polar Rose, a facial recognition software company that was bought by Apple in September 2010.

From July on, you can buy the book at Amazon.com or, as mentioned, download the draft from the book’s website.

Who is collaborating?

Collaboration graph of master thesis created with collabgraph

In my scarce spare time, I have written Collabgraph to visualize connections between authors of scientific publications.

This Python script reads a (your) BibTeX file and draws a graph in which the nodes are authors and an edge indicates that two authors have collaborated (or at least wrote a paper together).
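The core idea can be sketched with nothing but the standard library (the actual script uses pygraphviz for the drawing). The BibTeX fragment and the author names below are made up for illustration:

```python
import re
from itertools import combinations

# A tiny made-up BibTeX fragment standing in for a real .bib file.
bib = """
@article{a, author = {Eakins, John and Meier, Anna}, title = {X}}
@article{b, author = {Meier, Anna and Flickner, Myron}, title = {Y}}
"""

edges = set()
# Grab every author field; BibTeX separates co-authors with " and ".
for field in re.findall(r'author\s*=\s*\{([^}]*)\}', bib):
    authors = [a.strip() for a in field.split(' and ')]
    # Every pair of co-authors on one paper becomes one graph edge.
    for pair in combinations(sorted(authors), 2):
        edges.add(pair)

print(sorted(edges))
```

The resulting edge set is then handed to pygraphviz, which lays out the nodes and renders the picture.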

On the right is the graph created from the references used in my diploma thesis. You can immediately see the central role that Eakins, Meier and Flickner played.

Collabgraph requires only the pygraphviz library, which can be installed with “easy_install pygraphviz”.

You can find the source code and the example at bitbucket.org.

I am looking forward to your feedback!