Diagnosing model behavior in deployment settings traditionally requires labor-intensive data acquisition and annotation. Our proposed method, DrML, can discover high-error data slices, identify influential attributes, and rectify undesirable model behaviors, all without requiring any visual data. Through a combination of theoretical explanation and empirical verification, we present conditions under which classifiers trained on embeddings from one modality can be equivalently applied to embeddings from another modality.
Yuhui Zhang, Jeff Z HaoChen, Mars (Shih-Cheng) Huang, Kuan-Chieh Wang, James Zou, Serena Yeung
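
The abstract's claim about applying a classifier across modalities can be illustrated with a toy example. The following is a minimal sketch on synthetic data, not the DrML implementation. It assumes that "image" and "text" embeddings of the same class share a prototype and differ only by a constant offset plus small noise, and that each modality is centered by its own mean before classification. The prototypes, the offset, the noise level, the nearest-centroid classifier, and all function names are hypothetical choices made for illustration; the abstract does not specify them.

```python
import random

def sample_embeddings(prototypes, n_per_class, offset, noise_std, rng):
    """Draw synthetic embeddings: class prototype + modality offset + Gaussian noise."""
    xs, ys = [], []
    for label, proto in enumerate(prototypes):
        for _ in range(n_per_class):
            xs.append([p + o + rng.gauss(0.0, noise_std) for p, o in zip(proto, offset)])
            ys.append(label)
    return xs, ys

def mean_vector(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def center(vectors):
    """Subtract the set's own mean from every vector."""
    mu = mean_vector(vectors)
    return [[v - m for v, m in zip(vec, mu)] for vec in vectors]

def fit_centroids(xs, ys):
    """Nearest-centroid classifier: one mean vector per class label."""
    return {c: mean_vector([x for x, y in zip(xs, ys) if y == c]) for c in set(ys)}

def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda c: sq_dist(centroids[c], x))

def accuracy(centroids, xs, ys):
    """Fraction of points whose predicted label matches the true label."""
    return sum(predict(centroids, x) == y for x, y in zip(xs, ys)) / len(ys)

rng = random.Random(0)

# Hypothetical class prototypes in a shared 4-dimensional embedding space.
prototypes = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
image_offset = [0.0, 0.0, 0.0, 0.0]
text_offset = [3.0, 0.0, 0.0, 0.0]  # assumed constant gap between the two modalities

image_x, image_y = sample_embeddings(prototypes, 50, image_offset, 0.05, rng)
text_x, text_y = sample_embeddings(prototypes, 50, text_offset, 0.05, rng)

# Train on raw image embeddings, evaluate on raw text embeddings.
raw_clf = fit_centroids(image_x, image_y)
raw_acc = accuracy(raw_clf, text_x, text_y)

# Train on centered image embeddings, evaluate on text embeddings
# centered with the text set's own mean (no image data needed at test time).
centered_clf = fit_centroids(center(image_x), image_y)
centered_acc = accuracy(centered_clf, center(text_x), text_y)

print(f"raw transfer accuracy:      {raw_acc:.2f}")
print(f"centered transfer accuracy: {centered_acc:.2f}")
```

In this toy setting, the classifier applied to raw text embeddings should do much worse than the one applied after centering, because the assumed offset pushes every text point toward a single image centroid. Real embedding spaces need not satisfy the constant-offset assumption, so the sketch only shows what "equivalently applied" could look like under one idealized condition.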