Measuring Feature Dependency of Neural Networks by Collapsing Feature Dimensions in the Data Manifold (2024)

Abstract

This paper introduces a new technique to measure the feature dependency of neural network models. The motivation is to better understand a model by querying whether it is using information from human-understandable features, e.g., anatomical shape, volume, or image texture. Our method is based on the principle that if a model is dependent on a feature, then removal of that feature should significantly harm its performance. A targeted feature is "removed" by collapsing the dimension in the data distribution that corresponds to that feature. We perform this by moving data points along the feature dimension to a baseline feature value while staying on the data manifold, as estimated by a deep generative model. We then observe how the model's performance changes on the modified test dataset, with the target feature dimension removed. We test our method on deep neural network models trained on synthetic image data with known ground truth, an Alzheimer's disease prediction task using MRI and hippocampus segmentations from the OASIS-3 dataset [1], and a cell nuclei classification task using the Lizard dataset [2].

Copyright © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Published in: 2024 IEEE 21st International Symposium on Biomedical Imaging (ISBI).

1 Introduction

Deep neural networks (DNNs) have shown great success in many medical imaging tasks but lack transparency in their decision-making processes. This is particularly problematic in medical applications of deep learning for at least two reasons. First, for a deep learning system to be trustworthy when making health care decisions, it is critical that its decision rules be explainable and plausible to a medical expert. Second, in clinical research it is often important to derive understanding of a biological process. Both scenarios benefit from the ability to explain the features a DNN is using in terms that a human expert can understand. In this work, we are specifically interested in evaluating whether certain target features are indeed used by an existing DNN. This is in contrast to explainability methods that strive to extract features being computed by a DNN and present them in a hopefully interpretable fashion [3, 4, 5, 6, 7, 8]. Rather, we wish to take features that are known to be interpretable, and important in a particular domain, and query if those features are being used by the DNN. For example, it is a well-established fact that Alzheimer's disease (AD) causes atrophy of the hippocampus. If we are given a DNN that classifies AD patients versus healthy subjects from magnetic resonance imaging (MRI), we would want to know if that classifier is in some way using the volume of a subject's hippocampi in its decision rule.

To address this problem, one prevailing option is to locally approximate the nonlinear decision boundary with an easier-to-interpret linear approximation, either in data space [9, 10] or in latent space [11]. Yet most interpretable features of interest are nonlinear functions, which can be lost by linear approximations. Jin et al. [12] tackle non-linearity by interpreting neural networks in terms of the alignment between gradients of a classifier and gradients of a feature along the gradient flow of the classifier. However, the magnitude of the gradient alignment can itself be hard to connect to how important that feature is to the classifier. Closer in nature to our proposed approach, CaCE [13] explores the impact of modifying discrete concepts of the input by utilizing conditional generators. We extend it to continuous features and compare it with our proposed method in the experiments.

Our proposed method aims to "remove" a feature from a test dataset and measure how much this negatively affects the test performance of the DNN. When the features are merely subsets of the original input dimensions, they can be easily masked [14]. However, this is often not useful for imaging data, where the original input dimensions correspond to individual pixel/voxel values. When the target feature is a complex, nonlinear function of the input image, removing it from the data, while keeping the data valid and realistic, is not trivial. We propose to do this by modeling feature collapse as an integral curve of the target feature's gradient vector field in the latent space of a generative model that has learned the data distribution. By restricting to the data manifold estimated by the generative model, we ensure that other characterizing features of the data are preserved. We can eliminate a target feature from the test data by manipulating each data point to have a common constant value for this feature (e.g., adjusting MR images so that they all have equal hippocampal volume).

[Figure 1]

2 Methods

We consider a neural network classifier as a mapping $g: \mathbb{R}^D \to \mathbb{R}^K$, where an input image is considered as a point $x \in \mathbb{R}^D$, and the corresponding output is the vector of assigned log class probabilities, $\ln p(y = k \mid x),\ k = 1, \ldots, K$. We define a feature as a differentiable function $f: \mathbb{R}^D \to \mathbb{R}$. To measure the dependency of the classifier $g$ on the feature $f$, we propose to observe the change in performance of $g$, e.g., accuracy, when the feature $f$ is "removed". Given the original test dataset $X$, we modify each point in $X$ by collapsing the dimension corresponding to the feature $f$. The modified test set with the feature $f$ collapsed is denoted $X_{\overline{f}}$. We then test the classifier $g$ on this new test dataset and compare the performance on $X_{\overline{f}}$ to the original performance on $X$. The dependency of $g$ on the feature $f$ is reflected by how much the performance drops.
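To make the evaluation protocol concrete, here is a minimal PyTorch sketch, assuming a trained classifier g that maps a batch of images to log class probabilities and a routine collapse_feature (e.g., Algorithm 1 below) that maps a single test image to its feature-collapsed version; the function names are illustrative, not taken from the paper's code.

```python
import torch

@torch.no_grad()
def accuracy(g, images, labels):
    """Fraction of test points that the classifier g labels correctly."""
    preds = g(images).argmax(dim=1)          # g outputs log class probabilities
    return (preds == labels).float().mean().item()

def feature_dependency(g, images, labels, collapse_feature):
    """Accuracy on the original test set X versus on X with the feature collapsed."""
    acc_original = accuracy(g, images, labels)
    collapsed = torch.stack([collapse_feature(x) for x in images])
    acc_after_collapse = accuracy(g, collapsed, labels)   # the paper's AAC
    return acc_original, acc_after_collapse
```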

Collapsing a feature dimension is generally a non-trivial task. If the feature is a linear function of the input, i.e., $f(x) = v \cdot x$, where $v \in \mathbb{R}^D$ is a constant unit vector, then collapsing the feature dimension is simply projection onto the orthogonal complement of $v$. In other words, the collapsing operation in the linear case is given by $x_{\overline{f}} = x - (x \cdot v)v$. The resulting collapsed data points will have a constant feature value, $f(x_{\overline{f}}) = 0$, so the information from $f$ is effectively removed. The more general case, where $f$ is a nonlinear function, is more complicated. We might consider moving data points along integral curves of the gradient of $f$ until we arrive at a constant value for $f$. That is, we integrate the ordinary differential equation

$$\frac{dc}{dt}(t) = \nabla f(c(t)), \tag{1}$$

with initial condition $c(0) = x$, and stopping when $f(c(t)) = b$ for some predetermined baseline value $b$. Note that this is analogous to the linear case, where orthogonal projection moves along the constant gradient field, $\nabla f = v$. Also note that the integration of (1) may need to be forward or backward in $t$, depending on whether the initial feature value, $f(x)$, is above or below the baseline $b$.
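The following is a minimal PyTorch sketch of both cases, assuming flattened inputs, a scalar-valued differentiable feature f, and an illustrative step size and iteration cap that are not taken from the paper; the manifold-constrained variant that the paper actually uses appears below as Algorithm 1.

```python
import torch

def collapse_linear_feature(x, v):
    """Linear case: collapse f(x) = v·x by projecting the flattened point x
    onto the orthogonal complement of the unit direction v."""
    v = v / v.norm()
    return x - (x @ v) * v

def collapse_in_data_space(x, f, b, alpha=1e-2, max_steps=1000):
    """Nonlinear case: Euler integration of dc/dt = ∇f(c) in the ambient data
    space, moving toward the baseline b and stopping once it is reached."""
    c = x.clone().requires_grad_(True)
    s = torch.sign(f(c) - b).item()          # integrate forward or backward in t
    for _ in range(max_steps):
        if s * (f(c) - b).item() <= 0:       # baseline reached (or crossed)
            break
        (grad,) = torch.autograd.grad(f(c), c)
        c = (c - s * alpha * grad).detach().requires_grad_(True)
    return c.detach()
```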

There is a serious drawback to this strategy of moving a data point along the integral curves of $\nabla f$: the integral curve may move outside of the data distribution. More specifically, if we think of our data distribution as lying on a lower-dimensional manifold in the data space, the integral curves of $\nabla f$ may leave the data manifold. This is illustrated in Fig. 1. Thus, moving along the integral curves of $\nabla f$ may produce invalid data, i.e., data that does not look like realistic samples from the data distribution. As an example, imagine we have a dataset of images of white ellipses on a black background, and we compute the aspect ratio of the ellipse as our feature $f$. As shown on the right side of Fig. 1, if we try to change the aspect ratio of an ellipse image directly by moving along the gradient direction of this feature in the ambient image space, then we produce an image that does not look like an ellipse with an adjusted aspect ratio. This is because we have moved off of the manifold of valid ellipse images.

[Figure 2]

To handle this, we propose to first train a deep generative model to learn the data distribution and then restrict movements along our feature gradient to remain on the estimated data manifold. In this work we use a variational autoencoder (VAE) [15] with encoder $\psi: \mathbb{R}^D \to \mathbb{R}^d$ and decoder $\phi: \mathbb{R}^d \to \mathbb{R}^D$, where $\mathbb{R}^d$ ($d < D$) is the latent space. Given a data point $x \in \mathbb{R}^D$, we first encode it to produce a latent representation, $z = \psi(x)$. Now, we proceed with the same strategy to collapse the feature $f$ by moving along the gradient direction, but we constrain this to be the gradient of $f$ restricted to the estimated data manifold. The feature $f$ restricted to the VAE manifold is given by $f \circ \phi$, and the integral curve of the gradient is now

$$\frac{dc}{dt}(t) = \nabla_z (f \circ \phi)(c(t)) = D\phi(c(t))^{T}\, \nabla_x f(\phi(c(t))), \tag{2}$$

where $c(t)$ is a curve in the latent space, $\mathbb{R}^d$, $D\phi$ is the Jacobian matrix of $\phi$, and $\nabla_z$, $\nabla_x$ are gradients with respect to a latent point $z$ or a data point $x$, respectively. Note that the multiplication by $D\phi^{T}$ in (2) comes from the chain rule and is computed as a backpropagation through the decoder, $\phi$.
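In code, this vector-Jacobian product does not require forming the Jacobian explicitly; a short PyTorch sketch (with the decoder phi and feature f as generic callables) is:

```python
import torch

def latent_gradient(phi, f, z):
    """Right-hand side of (2): Dϕ(z)^T ∇_x f(ϕ(z)), obtained by backpropagating
    the scalar feature value through the decoder ϕ."""
    z = z.detach().requires_grad_(True)
    x = phi(z)                                # decode to data space
    (grad_z,) = torch.autograd.grad(f(x), z)  # autograd applies the chain rule
    return grad_z
```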

We start integrating (2) at the encoded input data point, $c(0) = z$, and we integrate (forward or backward) until we reach the desired baseline feature value at some time $T$. The end result is a point $z_{\overline{f}} = c(T)$ that lies on the baseline level set for $f$, $\mathcal{L}_b = \{z \mid f(\phi(z)) = b\}$. This corresponds to a data point with the $f$ feature collapsed; that is, it produces $x_{\overline{f}} = \phi(z_{\overline{f}})$ such that $f(x_{\overline{f}}) = b$. The overall process is illustrated in Fig. 2. In practice, we perform the gradient vector field integration with discrete Euler steps, as summarized in Algorithm 1.

Algorithm 1: Feature collapse on the data manifold

Require: a data point $x$ (the feature $f$, baseline $b$, step size $\alpha$, encoder $\psi$, and decoder $\phi$ are given)
Ensure: data point $x_{\overline{f}}$ is returned

  $z \leftarrow \psi(x)$ and $x' \leftarrow \phi(z)$
  $s \leftarrow \mathrm{sign}(f(x') - b)$
  while $s \cdot (f(x') - b) > 0$ do
      $v \leftarrow D\phi(z)^{T} \nabla f(x')$
      $z \leftarrow z - s\alpha v$
      $x' \leftarrow \phi(z)$
  end while
  return $x'$
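A possible PyTorch realization of Algorithm 1 is sketched below. It assumes the encoder psi returns a single latent code (e.g., the VAE posterior mean) and uses an illustrative fixed step size and iteration cap rather than the paper's settings.

```python
import torch

def collapse_feature(x, psi, phi, f, b, alpha=1e-2, max_steps=1000):
    """Collapse the feature f of a data point x to the baseline b by Euler steps
    along the gradient of f∘ϕ in the VAE latent space (sketch of Algorithm 1)."""
    with torch.no_grad():
        z = psi(x)                            # encode onto the estimated manifold
    z = z.detach().requires_grad_(True)
    x_prime = phi(z)
    s = torch.sign(f(x_prime) - b).item()     # direction of integration
    for _ in range(max_steps):                # cap guards against non-convergence
        if s * (f(x_prime) - b).item() <= 0:  # stop once the baseline is crossed
            break
        (v,) = torch.autograd.grad(f(x_prime), z)   # v = Dϕ(z)^T ∇f(x')
        z = (z - s * alpha * v).detach().requires_grad_(True)
        x_prime = phi(z)
    return x_prime.detach()                   # x_f̄ with f(x_f̄) ≈ b
```

Applying this routine to every point of the test set and re-evaluating the classifier yields the accuracy-after-collapse measure used in the experiments.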

[Figure 3]

3 Experiments

3.1 Setup

We test our method on three image datasets: a synthetic dataset of binary ellipse images, hippocampi from MRI in OASIS-3 [1], and cell nuclei histology images from the Lizard dataset [2]. All experiments are implemented using PyTorch [16]. For each classification task, we perform 5-fold cross validation. We evaluate the dependency of each classifier on certain interpretable features using our proposed feature collapse method (Algorithm 1) applied to the test set, and compare the results to CaCE scores [13].

The ellipse dataset contains 10,000 grayscale images of white ellipses on black backgrounds with five varying generative factors: x and y position, size, rotation angle, and aspect ratio. We assign them to two classes separated only by aspect ratio. For the hippocampus dataset, we cropped a region of interest around the left and right hippocampi in T1-weighted MRI from 925 subjects in the OASIS-3 dataset [1], masked out the background voxels using FreeSurfer segmentations [17], and applied a Gaussian blur. The task is to classify AD versus healthy subjects. The cell nuclei histology data are derived from the Lizard dataset [2]. We cropped the images around each nucleus, 2,000 from each of the six annotated nuclei types, masked out other areas, and applied a Gaussian filter. For each dataset we trained a deep convolutional classifier for the task, a plain convolutional VAE to learn the data manifold for our method, and a conditional VAE to implement CaCE.

We have chosen several interpretable features to test for classifier dependency, as listed in Table 2. The aspect ratio is calculated as the ratio of the major and minor eigenvalues of the second-order image moments. For every dataset, we also included ten random linear features (RLF) as baselines. Since CaCE was originally proposed for discrete concepts, we extend it to continuous features by using a low and a high feature value to represent each feature and generating two sets of random samples with the conditional VAE. We experimented with different choices of these feature values, as shown in Table 2.
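As a concrete illustration, the sketch below implements two of these as differentiable functions of the image, as required for the gradient flow: the moment-based aspect ratio and a random linear feature. The exact normalization and seeding are our own assumptions rather than the paper's implementation.

```python
import torch

def aspect_ratio(img, eps=1e-8):
    """Ratio of the major to minor eigenvalue of the second-order central
    moments of a (soft) binary image; differentiable, so ∇f is available."""
    h, w = img.shape[-2:]
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=img.dtype),
        torch.arange(w, dtype=img.dtype),
        indexing="ij",
    )
    mass = img.sum() + eps
    cy, cx = (img * ys).sum() / mass, (img * xs).sum() / mass
    dy, dx = ys - cy, xs - cx
    cov = torch.stack([
        torch.stack([(img * dx * dx).sum(), (img * dx * dy).sum()]),
        torch.stack([(img * dx * dy).sum(), (img * dy * dy).sum()]),
    ]) / mass
    evals = torch.linalg.eigvalsh(cov)        # eigenvalues in ascending order
    return evals[1] / (evals[0] + eps)

def random_linear_feature(shape, seed=0):
    """A random linear feature f(x) = v·x with a fixed random unit direction v,
    used as an RLF baseline in the same collapse procedure."""
    gen = torch.Generator().manual_seed(seed)
    v = torch.randn(shape, generator=gen)
    v = v / v.norm()
    return lambda x: (x * v).sum()
```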

[Figures 4 and 5]

3.2 Results

First, we visualize the feature collapsing results in Fig. 4 for two examples: hippocampus volume and cell nucleus color hue. Every feature is collapsed to its mean value over the dataset. The altered images remain realistic, while the selected feature is successfully changed. In Fig. 5, we show the distributions of volume feature values for three datasets: the original hippocampus data, samples generated by a conditional VAE with the volume condition set to the same baseline value, and samples generated by our method, which collapses the volume feature dimension. Our method constrains feature values around the baseline much more precisely than the conditional VAE; the standard deviation of the feature values of our generated samples is consistently less than 2% of that of the original dataset. In Table 1, we report the performance of the evaluated neural network models on the original dataset, averaged over 5 folds. Balanced accuracy is used for the hippocampus dataset. To verify that the reconstruction quality of the VAE does not significantly affect the classifier, we also test each model on the reconstructed dataset, which shows only a slight drop in performance.

Table 1: Classification accuracy (mean ± std over 5 folds) on the original and VAE-reconstructed test data.

Dataset      | Original data    | Reconstructed data
Ellipse      | 0.872 (± 0.010)  | 0.854 (± 0.013)
Hippocampus  | 0.821 (± 0.020)  | 0.816 (± 0.028)
Nucleus      | 0.606 (± 0.004)  | 0.581 (± 0.002)

Table 2: Accuracy after collapse (AAC, mean ± std) for our method, and CaCE scores for two choices of low/high feature values (25%/75% and 5%/95%).

Ellipse dataset
Feature       | AAC (ours)       | CaCE 25%/75% | CaCE 5%/95%
X-coord       | 0.770 (± 0.013)  | 0.104        | 0.0981
Y-coord       | 0.750 (± 0.017)  | 0.011        | 0.007
Size          | 0.812 (± 0.009)  | 0.315        | 0.47
Aspect ratio  | 0.494 (± 0.010)  | 0.916        | 0.989
RLF           | 0.786 (± 0.021)  | -            | -

Hippocampus dataset
Feature       | AAC (ours)       | CaCE 25%/75% | CaCE 5%/95%
Volume        | 0.530 (± 0.027)  | 0.407        | 0.885
Aspect ratio  | 0.798 (± 0.029)  | 0.076        | 0.221
Avg. bright.  | 0.806 (± 0.026)  | 0.094        | 0.296
RLF           | 0.816 (± 0.025)  | -            | -

Nucleus dataset
Feature       | AAC (ours)
Size          | 0.449 (± 0.013)
Aspect ratio  | 0.519 (± 0.007)
Saturation    | 0.468 (± 0.003)
Hue           | 0.482 (± 0.011)
RLF           | 0.524 (± 0.010)

Next, we perform the feature dimension collapse and report the accuracy after collapse (AAC), along with CaCE scores on the corresponding features, in Table 2. The performance drop when collapsing along random linear feature (RLF) dimensions is used as a baseline for our method. In the ellipse dataset, collapsing the aspect ratio reduces the accuracy to roughly random chance. This is exactly what we would hope, because the aspect ratio is the only feature that separates the two classes. Collapsing the other features affects the performance only slightly, similarly to or less than the RLF baseline, from which we conclude that the model does not depend on them. In the hippocampus experiment, the model depends heavily on volume: accuracy drops almost to random chance when volume is removed. The other features (aspect ratio, brightness) do not substantially influence the classifier. Note that despite the well-established fact that AD reduces hippocampal volume, the black-box nature of deep neural networks means we do not know a priori whether volume is a feature the classifier has learned. Our AAC measure confirms that volume is an essential feature for this classifier. Finally, for the nucleus dataset, the results suggest that the size, saturation, and hue features are being used by the classifier to identify cell types (note that random chance here is 1/6 ≈ 0.167).

While CaCE scores generally align with our method in identifying important features, their scale does not necessarily indicate how critical a feature is. For instance, in the ellipse dataset, the aspect ratio is the sole distinguishing feature between the two classes; however, the size feature still receives a notably high CaCE score. This may be due in part to the conditional VAE's inability to constrain feature values effectively, as discussed above. Moreover, applying CaCE to continuous features poses challenges, as varying the chosen feature values leads to unpredictable changes in the score. Also note that CaCE is only defined for binary classifiers, and thus we do not compare it to AAC on the multi-class cell nuclei dataset.
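For completeness, the continuous-feature CaCE comparison can be sketched roughly as below; the conditional decoder signature, the use of the second output column as the positive class, and the sample handling are our assumptions about the comparison setup, not the original CaCE implementation.

```python
import torch

@torch.no_grad()
def cace_score(g, cond_decoder, z_samples, low_value, high_value):
    """Rough sketch of a continuous-feature CaCE score: difference in the
    classifier's mean positive-class probability between samples generated with
    the feature condition set to a high versus a low value."""
    n = z_samples.shape[0]
    x_low = cond_decoder(z_samples, torch.full((n, 1), low_value))
    x_high = cond_decoder(z_samples, torch.full((n, 1), high_value))
    p_low = g(x_low).exp()[:, 1].mean()       # g outputs log class probabilities
    p_high = g(x_high).exp()[:, 1].mean()
    return (p_high - p_low).abs().item()
```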

Finally, we performed an ablation study on the hippocampus dataset to test the importance of using the VAE model to restrict the feature collapse to the data manifold. We repeated the hippocampus experiment, but with feature collapse performed directly in the data space, i.e., by integrating the feature gradients using Equation (1) in the ambient space. The resulting accuracy after collapsing volume is 0.772 (± 0.043), and the results for aspect ratio and average brightness are 0.814 (± 0.021) and 0.819 (± 0.021), respectively. The performance drop for volume is much less drastic and also much more variable across test/train splits. This indicates that the model's behavior becomes less predictable when features are collapsed in the ambient data space, likely because the modified images move off of the data manifold.

In conclusion, our method effectively captures classifier feature dependencies, assessing how essential a feature is to the classifier rather than merely whether it is relevant. Note that our method requires a sufficiently large sample size for VAE training and relies on access to feature gradients. We plan to address these limitations in future research.

4 Compliance with Ethical Standards

This research study was conducted retrospectively using human subject data made available in open access by LaMontagne et al. [1] and Graham et al. [2]. Ethical approval was not required, as confirmed by the license attached to the open access data.

5 Acknowledgements

This work was partially supported by NSF Smart and Connected Health grant 2205417.

References

  • [1] P. J. LaMontagne et al., "OASIS-3: Longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and Alzheimer disease," MedRxiv, 2019.
  • [2] S. Graham et al., "Lizard: A large-scale dataset for colonic nuclear instance segmentation and classification," in ICCVW, 2021, pp. 684–693.
  • [3] M. Sundararajan et al., "Axiomatic attribution for deep networks," in ICML, 2017, pp. 3319–3328.
  • [4] R. Hesse, S. Schaub-Meyer, and S. Roth, "Fast axiomatic attribution for neural networks," Advances in Neural Information Processing Systems, vol. 34, pp. 19513–19524, 2021.
  • [5] R. R. Selvaraju et al., "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in ICCV, 2017, pp. 618–626.
  • [6] S. Bach et al., "On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation," PLoS ONE, vol. 10, no. 7, p. e0130140, 2015.
  • [7] Y. Goyal et al., "Counterfactual visual explanations," in ICML, 2019, pp. 2376–2384.
  • [8] K. Kanamori et al., "DACE: Distribution-aware counterfactual explanation by mixed-integer linear optimization," in IJCAI, 2020, pp. 2855–2862.
  • [9] M. T. Ribeiro et al., "'Why should I trust you?' Explaining the predictions of any classifier," in KDD, 2016, pp. 1135–1144.
  • [10] I. Ahern et al., "NormLIME: A new feature importance metric for explaining deep neural networks," arXiv preprint arXiv:1909.04200, 2019.
  • [11] B. Kim et al., "Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV)," in ICML, 2018, pp. 2668–2677.
  • [12] Y. Jin et al., "Feature gradient flow for interpreting deep neural networks in head and neck cancer prediction," in ISBI, 2022.
  • [13] Y. Goyal, A. Feder, U. Shalit, and B. Kim, "Explaining classifiers with causal concept effect (CaCE)," arXiv preprint arXiv:1907.07165, 2019.
  • [14] H. Chen, I. C. Covert, S. M. Lundberg, and S.-I. Lee, "Algorithms to estimate Shapley value feature attributions," Nature Machine Intelligence, pp. 1–12, 2023.
  • [15] D. P. Kingma and M. Welling, "Auto-encoding variational Bayes," in ICLR, 2014.
  • [16] A. Paszke et al., "Automatic differentiation in PyTorch," 2017.
  • [17] B. Fischl et al., "Whole brain segmentation: Automated labeling of neuroanatomical structures in the human brain," Neuron, vol. 33, no. 3, pp. 341–355, 2002.