
Confusion Matrix

By Marcelo Fernandes Dec 7, 2017


A confusion matrix is a way of measuring how well a classification model performs. The main idea behind this evaluation tool is to see how the model's predictions split into correct classifications, false positives, and false negatives.

To understand how it works, we are going to use a simple example: a linear model that draws a decision line between two classes.

In the graph below, the positive entries are marked in blue and the negative entries in orange:



We can easily see that our model is not perfect, since it does not fully separate the positives from the negatives.

We have 7 positive entries and 8 negative entries. Our line tries to divide the two groups, but it misses 1 positive entry that falls under the decision line (a false negative), and it also misses 3 negative entries that fall above the decision line (false positives).


Therefore:

  • Out of 7 positive entries, we got 6 right classifications: 6/7 = 85.71% (True Positives)
  • Out of 7 positive entries, we got 1 wrong classification: 1/7 = 14.29% (False Negatives)

  • Out of 8 negative entries, we got 5 right classifications: 5/8 = 62.50% (True Negatives)
  • Out of 8 negative entries, we got 3 wrong classifications: 3/8 = 37.50% (False Positives)
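The percentages above can be reproduced with a few lines of Python. This is only a minimal sketch: the counts come from the example, while the variable names and the `rate` helper are illustrative choices, not part of any library.

```python
# Counts taken from the example above.
tp, fn = 6, 1  # the 7 positive entries: 6 classified correctly, 1 missed
tn, fp = 5, 3  # the 8 negative entries: 5 classified correctly, 3 missed


def rate(part, total):
    """Return part/total as a percentage rounded to two decimals (hypothetical helper)."""
    return round(100 * part / total, 2)


true_positive_rate = rate(tp, tp + fn)   # share of positives classified correctly
false_negative_rate = rate(fn, tp + fn)  # share of positives classified wrongly
true_negative_rate = rate(tn, tn + fp)   # share of negatives classified correctly
false_positive_rate = rate(fp, tn + fp)  # share of negatives classified wrongly

print(true_positive_rate, false_negative_rate)
print(true_negative_rate, false_positive_rate)
```

Note that each pair of rates is computed against its own class total (7 positives or 8 negatives), so each pair adds up to 100%.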

As a result, we get the following table:

                     Predicted positive   Predicted negative
  Actual positive    6 (TP)               1 (FN)
  Actual negative    3 (FP)               5 (TN)



As we can see, the diagonal elements represent the number of points for which the predicted label is equal to the true label, while the off-diagonal elements are those that are mislabeled by the classifier.
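To make the diagonal and off-diagonal reading concrete, here is a rough sketch that builds the matrix from label lists in plain Python. The `y_true` and `y_pred` lists are made up to match the counts in the example, and the layout (rows are actual classes, columns are predicted classes, positive class first) is an assumption of this sketch.

```python
# Illustrative labels matching the example: 1 = positive, 0 = negative.
y_true = [1] * 7 + [0] * 8
y_pred = [1] * 6 + [0] * 1 + [1] * 3 + [0] * 5

labels = [1, 0]  # assumed order: positive first, then negative
index = {label: i for i, label in enumerate(labels)}

# matrix[row][col]: row = actual class, col = predicted class.
matrix = [[0, 0], [0, 0]]
for actual, predicted in zip(y_true, y_pred):
    matrix[index[actual]][index[predicted]] += 1

# Diagonal cells hold the correct predictions; everything else is mislabeled.
correct = sum(matrix[i][i] for i in range(len(labels)))
mislabeled = len(y_true) - correct

for row in matrix:
    print(row)
print("correct:", correct, "mislabeled:", mislabeled)
```

With this layout, the top-left and bottom-right cells are the true positives and true negatives, and the two remaining cells are the false negatives and false positives.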


Notes