While training a model with the Caffe deep learning framework, it is easy and useful to evaluate how the model performs on an independent test set. This helps to detect whether the model is overfitting or underfitting, and adding an Accuracy layer tells us how the model performs on a classification task.
But sometimes the accuracy metric is not enough, so I wrote a very simple Python layer for Caffe that replaces the accuracy layer and prints a confusion matrix, giving a slightly deeper understanding of what our classification model is doing right (or wrong). You can find the code on GitHub.
Using this Python layer is very easy. First of all, Caffe has to be compiled with support for Python layers; check out this post on @chrischoy's blog to learn how to compile Caffe with Python layer support and to learn more about Python layers in general. Once Caffe is built with that support, the layer just has to be declared like an accuracy layer in the prototxt file:
layer {
  type: 'Python'
  name: 'py_accuracy'
  top: 'py_accuracy'
  bottom: 'ip2'
  bottom: 'label'
  python_param {
    # the module name -- usually the filename -- which needs to be in $PYTHONPATH
    module: 'python_confmat'
    # the layer name -- the class name in the module
    layer: 'PythonConfMat'
    # the number of test iterations; it must match test_iter in the solver
    param_str: '{"test_iter":100}'
  }
  include {
    phase: TEST
  }
}
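For reference, here is a minimal sketch of what such a layer can look like. The actual implementation lives in the GitHub repo; the internals below (matrix accumulation and printing) are illustrative rather than a verbatim copy, but they follow the interface Caffe expects from a Python layer and the module/class names used in the prototxt above:

import json
import numpy as np
import caffe

class PythonConfMat(caffe.Layer):
    """Accuracy-style layer that accumulates a confusion matrix over the test set."""

    def setup(self, bottom, top):
        # bottom[0]: class scores (N x C), bottom[1]: ground-truth labels (N)
        if len(bottom) != 2:
            raise Exception('Need two bottoms: scores and labels.')
        params = json.loads(self.param_str)
        self.test_iter = params['test_iter']
        self.current_iter = 0
        self.conf_matrix = None

    def reshape(self, bottom, top):
        # one scalar output: the accuracy of the current batch
        top[0].reshape(1)

    def forward(self, bottom, top):
        scores = bottom[0].data
        labels = bottom[1].data.astype(int)
        if self.conf_matrix is None:
            num_classes = scores.shape[1]
            self.conf_matrix = np.zeros((num_classes, num_classes), dtype=int)
        preds = scores.argmax(axis=1)
        for truth, pred in zip(labels, preds):
            self.conf_matrix[truth, pred] += 1
        top[0].data[0] = float((preds == labels).mean())
        self.current_iter += 1
        # once every test batch has been seen, print and reset the matrix
        if self.current_iter == self.test_iter:
            print(self.conf_matrix)
            self.current_iter = 0
            self.conf_matrix = None

    def backward(self, top, propagate_down, bottom):
        # like the built-in Accuracy layer, this layer does not backpropagate
        pass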
There is a working example in the examples folder of the GitHub repo, which must be copied into the caffe/examples folder in order for the relative paths to work. The file python_confmat.py must be copied into caffe/examples/mnist for the example to work, but for your own usage you can place it anywhere, as long as its path is included in your $PYTHONPATH.
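As an aside, if you run training from pycaffe rather than the caffe command-line tool, you can make the module importable at runtime instead of exporting $PYTHONPATH: the layer module is imported in the same Python interpreter, so appending its directory to sys.path before building the solver works too. A small sketch, where the paths are just examples to adapt to your setup:

import sys
# directory containing python_confmat.py; adjust to your setup
sys.path.append('/path/to/caffe/examples/mnist')

import caffe
caffe.set_mode_cpu()
solver = caffe.get_solver('examples/mnist/lenet_solver.prototxt')
solver.solve()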
The confusion matrix is printed to the console and looks like this:
Confusion Matrix | Accuracy
------------------------------------------------------------------------
3438 166 191 16 45 9 136 0 | 85.93 %
191 3306 177 1 69 2 15 0 | 87.90 %
88 114 3205 34 431 46 80 3 | 80.10 %
30 12 98 3735 78 23 24 0 | 93.38 %
11 28 437 29 3196 65 45 11 | 83.62 %
3 0 64 7 38 3702 8 0 | 96.86 %
59 4 79 42 44 5 3234 1 | 93.25 %
2 0 29 3 113 9 6 2639 | 94.22 %
Number of test samples: 29676
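Each row corresponds to a true class and each column to a predicted class, so the accuracy printed at the end of each row is the diagonal entry divided by the row sum. As a quick sanity check, you can recompute those percentages (and the overall accuracy) from the matrix with a few lines of numpy:

import numpy as np

# the matrix printed above: rows are true classes, columns are predictions
conf = np.array([
    [3438,  166,  191,   16,   45,    9,  136,    0],
    [ 191, 3306,  177,    1,   69,    2,   15,    0],
    [  88,  114, 3205,   34,  431,   46,   80,    3],
    [  30,   12,   98, 3735,   78,   23,   24,    0],
    [  11,   28,  437,   29, 3196,   65,   45,   11],
    [   3,    0,   64,    7,   38, 3702,    8,    0],
    [  59,    4,   79,   42,   44,    5, 3234,    1],
    [   2,    0,   29,    3,  113,    9,    6, 2639],
])

per_class = 100.0 * conf.diagonal() / conf.sum(axis=1)  # e.g. 85.93 for class 0
overall = 100.0 * conf.diagonal().sum() / conf.sum()
print(per_class.round(2))
print('Overall accuracy: %.2f %% over %d samples' % (overall, conf.sum()))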
This is a nice example of how to use a Python layer in Caffe to create a confusion matrix during training. I hope it is useful; feel free to use it wherever you need it.