Interesting example of the activation atlas in action:
> These atlases not only reveal nuanced visual abstractions within a model, but they can also reveal high-level misunderstandings. For example, by looking at an activation atlas for a "great white shark" we water and triangular fins (as expected) but we also see something that looks like a baseball. This hints at a shortcut taken by this research model where it conflates the red baseball stitching with the open mouth of a great white shark.
> We can test this by using a patch of an image of a baseball to switch the model's classification of a particular image from "grey whale" to "great white shark".
Interesting example of the activation atlas in action:
> These atlases not only reveal nuanced visual abstractions within a model, but they can also reveal high-level misunderstandings. For example, by looking at an activation atlas for a "great white shark" we water and triangular fins (as expected) but we also see something that looks like a baseball. This hints at a shortcut taken by this research model where it conflates the red baseball stitching with the open mouth of a great white shark.
> We can test this by using a patch of an image of a baseball to switch the model's classification of a particular image from "grey whale" to "great white shark".