The study addresses the lack of transparency and interpretability in neural network predictions.
The authors propose a modular framework that learns to extract key phrases as justifications for sentiment analysis and text classification predictions.
Although the study is restricted to NLP problems, the basic approach of extracting the features most relevant to an accurate prediction could be useful in more general settings.
Prediction without justification has limited applicability. As a remedy, we learn to extract pieces of input text as justifications – rationales – that are tailored to be short and coherent, yet sufficient for making the same prediction. Our approach combines two modular components, generator and encoder, which are trained to operate well together. The generator specifies a distribution over text fragments as candidate rationales and these are passed through the encoder for prediction. Rationales are never given during training. Instead, the model is regularized by desiderata for rationales. We evaluate the approach on multi-aspect sentiment analysis against manually annotated test cases. Our approach outperforms attention-based baseline by a significant margin. We also successfully illustrate the method on the question retrieval task.
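To make the generator/encoder split in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. The generator is reduced to per-token selection probabilities, the encoder is a toy lexicon-based scorer, and the "short and coherent" desiderata are modeled as a penalty on the number of selected tokens plus a penalty on the number of on/off transitions in the selection mask. All function names, weights, and the lexicon are hypothetical.

```python
import random


def sample_rationale(select_probs, rng):
    """Generator step (toy): sample a binary mask, 1 = token is in the rationale."""
    return [1 if rng.random() < p else 0 for p in select_probs]


def rationale_penalty(mask, sparsity_weight=0.1, coherence_weight=0.2):
    """Regularizer (assumed form): favor short, contiguous selections.

    - sparsity term: number of selected tokens
    - coherence term: number of switches between selected and unselected
    """
    length = sum(mask)
    transitions = sum(abs(a - b) for a, b in zip(mask[1:], mask[:-1]))
    return sparsity_weight * length + coherence_weight * transitions


def toy_encoder(tokens, mask, lexicon):
    """Encoder step (toy): predict a score from the selected tokens only."""
    selected = [tok for tok, keep in zip(tokens, mask) if keep]
    return sum(lexicon.get(tok, 0.0) for tok in selected)


def rationale_cost(tokens, mask, target, lexicon):
    """Squared prediction error on the rationale plus the rationale penalty."""
    prediction = toy_encoder(tokens, mask, lexicon)
    return (prediction - target) ** 2 + rationale_penalty(mask)


if __name__ == "__main__":
    tokens = ["the", "beer", "smells", "really", "great", "today"]
    lexicon = {"great": 1.0}  # hypothetical sentiment weights
    rng = random.Random(0)
    mask = sample_rationale([0.1, 0.2, 0.3, 0.6, 0.9, 0.1], rng)
    print(mask, rationale_cost(tokens, mask, target=1.0, lexicon=lexicon))
```

In this sketch, two masks that both keep the decisive word "great" produce the same prediction, but the contiguous one ("really great") costs less than a scattered one ("beer ... great") because it has fewer transitions. In a trained model, the selection probabilities and the encoder would both be learned so that the expected cost is minimized over sampled rationales.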