Computer Vision Interpret¶
vision.interpret is the module that implements custom Interpretation classes for different vision tasks by inheriting from the base Interpretation class.
Let's show how SegmentationInterpretation can be used once we train a segmentation model.
train¶
from fastai.vision import *  # fastai v1: provides untar_data, URLs, SegmentationItemList, unet_learner, models, np, ...

camvid = untar_data(URLs.CAMVID_TINY)
path_lbl = camvid/'labels'
path_img = camvid/'images'
codes = np.loadtxt(camvid/'codes.txt', dtype=str)
get_y_fn = lambda x: path_lbl/f'{x.stem}_P{x.suffix}'
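get_y_fn maps each image path to the path of its mask, relying on CamVid's convention of naming each mask with a _P suffix. With a hypothetical filename, the mapping looks like this:
# Hypothetical filename, for illustration only:
get_y_fn(path_img/'0016E5_00390.png')   # -> path_lbl/'0016E5_00390_P.png'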
data = (SegmentationItemList.from_folder(path_img)
        .split_by_rand_pct()
        .label_from_func(get_y_fn, classes=codes)
        .transform(get_transforms(), tfm_y=True, size=128)
        .databunch(bs=16, path=camvid)
        .normalize(imagenet_stats))
data.show_batch(rows=2, figsize=(7,5))
learn = unet_learner(data, models.resnet18)
learn.fit_one_cycle(3, 1e-2)
learn.save('mini_train')
interpret¶
interp = SegmentationInterpretation.from_learner(learn)
Since FlattenedLoss of CrossEntropyLoss() is used, the loss is computed per pixel. To rank whole images, we reshape the flattened losses and take the mean of the pixel losses for each image; to do so we need to pass a sizes tuple to top_losses().
top_losses, top_idxs = interp.top_losses(sizes=(128,128))
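Conceptually, this reduction looks like the following sketch (made-up loss values, not the library's implementation):
import torch

# Hypothetical flattened per-pixel losses for 4 images of size 128x128.
pixel_losses = torch.rand(4 * 128 * 128)
# Reshape to (n_images, n_pixels) and average over pixels -> one loss per image.
per_image_loss = pixel_losses.view(4, 128 * 128).mean(dim=1)
# Sorting in descending order puts the hardest images first.
loss_vals, loss_idxs = per_image_loss.sort(descending=True)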
Next, we can generate a confusion matrix similar to the one we usually have for classification. Two confusion matrices are generated: mean_cm, which represents the global per-label performance, and single_img_cm, which represents the same thing for each individual image in the dataset.
Values in the matrix are calculated as:
\begin{align} CM_{ij} &= IoU(\text{Predicted},\ \text{True} \mid \text{True}) \end{align}
Or in plain English: the ratio of pixels predicted as label j among the pixels whose true label is i.
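To make this definition concrete, here is a minimal sketch (an illustration of the formula with random masks, not the code that _generate_confusion actually runs) computing such a matrix for a single image from integer masks:
import numpy as np

def intersect_given_true(pred, true, n_classes):
    # cm[i, j] = fraction of pixels whose true label is i that were predicted as label j
    cm = np.full((n_classes, n_classes), np.nan)
    for i in range(n_classes):
        true_i = (true == i)
        n_true = true_i.sum()
        if n_true == 0:
            continue  # label i is absent from this image, so its row stays NaN
        for j in range(n_classes):
            cm[i, j] = np.logical_and(pred == j, true_i).sum() / n_true
    return cm

# Illustration with random 128x128 masks and 3 hypothetical classes.
cm = intersect_given_true(np.random.randint(0, 3, (128, 128)),
                          np.random.randint(0, 3, (128, 128)), n_classes=3)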
learn.data.classes
mean_cm, single_img_cm = interp._generate_confusion()
_plot_intersect_cm first displays a dataframe showing the per-class scores using the IoU definition we gave earlier. These are the diagonal values from the confusion matrix, which is displayed afterwards.
NaN values indicate that these labels were not present in our dataset, in this case the validation set. As you can imagine, this can also help you construct a more representative validation set.
df = interp._plot_intersect_cm(mean_cm, "Mean of Ratio of Intersection given True Label")
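For example, one way to see which labels never occur in the validation masks, and will therefore show NaN scores, is the sketch below (assuming the fastai v1 data API used above):
import numpy as np

present = set()
for _, y in learn.data.valid_ds:          # y is an ImageSegment mask
    present |= set(np.unique(y.data.numpy()).tolist())
missing = [c for i, c in enumerate(codes) if i not in present]
missing                                    # classes that will show NaN scores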
Next, let's look at the single worst prediction in our dataset. It looks like this dummy model just predicts everything as Road :)
i = top_idxs[0]
df = interp._plot_intersect_cm(single_img_cm[i], f"Ratio of Intersection given True Label, Image:{i}")
Finally, we will visually inspect this single prediction.
interp.show_xyz(i, sz=15)
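To look at a few more of the hardest examples, the same call can be reused in a loop, for instance:
for i in top_idxs[:3]:
    interp.show_xyz(i, sz=15)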