I'm learning about XAI and I have a question about the derivative of a network. Assume I have a CNN model that gives 4 outputs representing 4 classes, and I have one target layer (L) from which I want to extract information when I pass an image through the model. When I take the derivative of one output with respect to L, I get a gradient matrix that has the same shape as the feature map. So what does that matrix represent?
Ex: The feature map at L has shape [256, 40, 40], and so does the gradient matrix.
model(I) ---> [p1, p2, p3, p4]
p4.backward()
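
One way to probe what each entry of that gradient means is to compare it against a finite difference. The sketch below is a toy, pure-Python stand-in and not the actual CNN: it assumes that everything after layer L is a global average pooling followed by a linear layer, and the names `head`, `grad_wrt_feature_map`, `finite_difference`, the small sizes, and the random weights are all hypothetical. It nudges a single activation in the feature map and checks how much p4 moves.

```python
# Toy stand-in for "everything after layer L": global average pooling
# followed by a linear layer producing 4 class scores.
# All names, sizes, and weights are hypothetical (not from the original model).
import random

C, H, W, NUM_CLASSES = 3, 4, 4, 4
random.seed(0)

# Feature map A at layer L, shape [C, H, W]
A = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
     for _ in range(C)]
# Linear layer weights, shape [NUM_CLASSES, C]
weights = [[random.uniform(-1, 1) for _ in range(C)]
           for _ in range(NUM_CLASSES)]


def head(feature_map):
    """Map a [C, H, W] feature map to NUM_CLASSES scores [p1, ..., p4]."""
    pooled = [sum(sum(row) for row in channel) / (H * W)
              for channel in feature_map]
    return [sum(w * p for w, p in zip(w_row, pooled)) for w_row in weights]


def grad_wrt_feature_map(class_idx):
    """Analytic d(score[class_idx]) / dA for this toy head; same shape as A."""
    return [[[weights[class_idx][c] / (H * W) for _ in range(W)]
             for _ in range(H)]
            for c in range(C)]


def finite_difference(class_idx, c, i, j, eps=1e-6):
    """Nudge one activation A[c][i][j] and measure how the score responds."""
    bumped = [[row[:] for row in channel] for channel in A]
    bumped[c][i][j] += eps
    return (head(bumped)[class_idx] - head(A)[class_idx]) / eps


grad = grad_wrt_feature_map(3)  # plays the role of the gradient after p4.backward()
print(len(grad), len(grad[0]), len(grad[0][0]))  # same shape as A: 3 4 4
print(grad[1][2][3], finite_difference(3, 1, 2, 3))  # the two values should nearly agree
```

Under these assumptions, the gradient entry at [c, i, j] is approximately the change in p4 per unit change of the activation at that position, which is why the gradient has the same shape as the feature map. In this toy head the gradient is constant within each channel; in a real CNN it would generally vary across spatial positions.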