Reconsidering evaluation data sets

In this blog post I want to share some interessting articles which deal with data sets in computer vision. For starters, in this blog post Tomasz Malisiewicz draws attention to a video lecture by Peter Norvig (Google) in which Mr Norvig showed some interesting results

where algorithms that obtained the best performance on a small dataset no longer did the best when the size of the training set was increased by an order of magnitude. … Also, the mediocre algorithms in the small training size regime often outperformed their more complicated counterparts once more data was utilized.

This is indeed interesting as it is always hard to say how much training and test data is necessary and most scientist, me as well, a far more interested in working on their precious algorithm instead of collecting a solid ground truth. Furthermore, as I pointed out in a comment for Tomasz’ blog post, using 10 times as many pictures would mean, I could only evaluate 3 feature combinations in the time I could have evaluated 30.

Answering to my question on how to handle that trade-off, he advocates nonparametric* approaches and

combining learning with data-driven approaches to reduce test time complexity.

I agree with him, that we definitely should spent more time and effort creating larger groundtruth sets, instead of optimizing our algorithms for a groundtruth that is too small to reveal anything.

For further reading I refer to Prof. Jain’s Blog, where he claims in his blog post, Evaluating Multimedia Algorithms, that the existing data sets for photo retrieval are

too small such as the Corel or Pascal datasets, too specific like the TRECVID dataset, or without ground truth, such as the several recent efforts by MIT and MSRA that gathered millions of Web images for testing,

and promotes his concept for gathering controlled data ground truths.

As the third read the Scienceblog features a story about a James DiCarlo, a neuroscientist in the McGovern Institute for Brain Research at MIT and graduates students Nicolas Pinto and David Cox of the Rowland Harvard Institute who

argue that natural photographic image sets, like the widely used Caltech101 database, have design flaws that enable computers to succeed where they would fail with more authentically varied images. For example, photographers tend to center objects in a frame and to prefer certain views and contexts. The visual system, by contrast, encounters objects in a much broader range of conditions.

They go on

We suspected that the supposedly natural images in current computer vision tests do not really engage the central problem of variability, and that our intuitions about what makes objects hard or easy to recognize are incorrect.”

I think all the three articles remind us, to reconsider the data sets we use for evaluation. Regarding their size,noisiness and their ‘naturality’.

* nonparametric as in using rank or order of the images


2 thoughts on “Reconsidering evaluation data sets

  1. This post was great back in 2009, and I want to point out that since then these “dataset issues” have been also addressed academically, in particular by professors Antonio Torralba of MIT and Alyosha Efros of CMU. These great computer vision visionaries wrote a CVPR 2011 paper, “Unbiased Look at Dataset Bias,” which really highlights the problems with these evaluation datasets.

    By training objects detectors/classifiers on one dataset, and evaluating them on other datasets, they empirically demonstrated dataset bias — detectors simply did not generalize well across datasets.

    See the project page and PDF paper at the following url:

    For anybody serious about object recognition research, the paper is a must-read.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s