Computer Vision Interpret¶
vision.interpret is the module that implements custom Interpretation classes for different vision tasks by inheriting from the base Interpretation class.
Let's show how SegmentationInterpretation can be used once we train a segmentation model.
train¶
from fastai.vision import *  # fastai v1: provides untar_data, URLs, SegmentationItemList, unet_learner, models, np, ...

camvid = untar_data(URLs.CAMVID_TINY)
path_lbl = camvid/'labels'
path_img = camvid/'images'
codes = np.loadtxt(camvid/'codes.txt', dtype=str)
get_y_fn = lambda x: path_lbl/f'{x.stem}_P{x.suffix}'
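get_y_fn maps each image path to the path of its mask, relying on CamVid's convention of naming each mask with a _P suffix. With a hypothetical filename, the mapping looks like this:
# Hypothetical filename, for illustration only:
get_y_fn(path_img/'0016E5_00390.png')   # -> path_lbl/'0016E5_00390_P.png'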
data = (SegmentationItemList.from_folder(path_img)
        .split_by_rand_pct()
        .label_from_func(get_y_fn, classes=codes)
        .transform(get_transforms(), tfm_y=True, size=128)
        .databunch(bs=16, path=camvid)
        .normalize(imagenet_stats))
data.show_batch(rows=2, figsize=(7,5))
learn = unet_learner(data, models.resnet18)
learn.fit_one_cycle(3, 1e-2)
learn.save('mini_train')
interpret¶
interp = SegmentationInterpretation.from_learner(learn)
Since FlattenedLoss of CrossEntropyLoss() is used, the loss is computed per pixel. To rank whole images, we reshape the flattened losses and take the mean of the pixel losses for each image; to do so we need to pass a sizes tuple to top_losses().
top_losses, top_idxs = interp.top_losses(sizes=(128,128))
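Conceptually, this reduction looks like the following sketch (made-up loss values, not the library's implementation):
import torch

# Hypothetical flattened per-pixel losses for 4 images of size 128x128.
pixel_losses = torch.rand(4 * 128 * 128)
# Reshape to (n_images, n_pixels) and average over pixels -> one loss per image.
per_image_loss = pixel_losses.view(4, 128 * 128).mean(dim=1)
# Sorting in descending order puts the hardest images first.
loss_vals, loss_idxs = per_image_loss.sort(descending=True)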
Next, we can generate a confusion matrix similar to the one we usually have for classification. Two confusion matrices are generated: mean_cm, which represents the global per-label performance, and single_img_cm, which represents the same thing for each individual image in the dataset.
Values in the matrix are calculated as:
\begin{align} CM_{ij} &= IoU(\text{Predicted},\ \text{True} \mid \text{True}) \end{align}
Or in plain English: the ratio of pixels predicted as label j among the pixels whose true label is i.
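To make this definition concrete, here is a minimal sketch (an illustration of the formula with random masks, not the code that _generate_confusion actually runs) computing such a matrix for a single image from integer masks:
import numpy as np

def intersect_given_true(pred, true, n_classes):
    # cm[i, j] = fraction of pixels whose true label is i that were predicted as label j
    cm = np.full((n_classes, n_classes), np.nan)
    for i in range(n_classes):
        true_i = (true == i)
        n_true = true_i.sum()
        if n_true == 0:
            continue  # label i is absent from this image, so its row stays NaN
        for j in range(n_classes):
            cm[i, j] = np.logical_and(pred == j, true_i).sum() / n_true
    return cm

# Illustration with random 128x128 masks and 3 hypothetical classes.
cm = intersect_given_true(np.random.randint(0, 3, (128, 128)),
                          np.random.randint(0, 3, (128, 128)), n_classes=3)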
learn.data.classes
mean_cm, single_img_cm = interp._generate_confusion()
_plot_intersect_cm first displays a dataframe showing the per-class scores using the IoU definition we gave earlier. These are the diagonal values from the confusion matrix, which is displayed afterwards.
NaN values indicate that these labels were not present in our dataset, in this case the validation set. As you can imagine, this can also help you construct a more representative validation set.
df = interp._plot_intersect_cm(mean_cm, "Mean of Ratio of Intersection given True Label")
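For example, one way to see which labels never occur in the validation masks, and will therefore show NaN scores, is the sketch below (assuming the fastai v1 data API used above):
import numpy as np

present = set()
for _, y in learn.data.valid_ds:          # y is an ImageSegment mask
    present |= set(np.unique(y.data.numpy()).tolist())
missing = [c for i, c in enumerate(codes) if i not in present]
missing                                    # classes that will show NaN scores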
Next, let's look at the single worst prediction in our dataset. It looks like this dummy model just predicts everything as Road :)
i = top_idxs[0]
df = interp._plot_intersect_cm(single_img_cm[i], f"Ratio of Intersection given True Label, Image:{i}")
Finally, we will visually inspect this single prediction.
interp.show_xyz(i, sz=15)
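To look at a few more of the hardest examples, the same call can be reused in a loop, for instance:
for i in top_idxs[:3]:
    interp.show_xyz(i, sz=15)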