Preprints

"+" denotes equal contribution.

Jack Merullo, Noah A. Smith, Sarah Wiegreffe+ and Yanai Elazar+. On Linear Representations and Pretraining Data Frequency in Language Models.

Sarah Wiegreffe, Oyvind Tafjord, Yonatan Belinkov, Hannaneh Hajishirzi, Ashish Sabharwal. Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions. PDF, code

Publications

Faeze Brahman, Sachin Kumar, Vidhisha Balachandran+ and Pradeep Dasigi+ and Valentina Pyatkin+ and Abhilasha Ravichander+ and Sarah Wiegreffe+, Nouha Dziri, Khyathi Chandu, Jack Hessel, Yulia Tsvetkov, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi. The Art of Saying No: Contextual Noncompliance in Language Models. NeurIPS 2024 Datasets and Benchmarks. PDF, code & models, data

Naomi Saphra+ and Sarah Wiegreffe+. Mechanistic? BlackBoxNLP workshop at EMNLP 2024. PDF

Shramay Palta, Nishant Balepur, Peter A. Rankel, Sarah Wiegreffe, Marine Carpuat, Rachel Rudinger. Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning. Findings of EMNLP 2024. PDF

Yanai Elazar, Bhargavi Paranjape+ and Hao Peng+ and Sarah Wiegreffe+, Khyathi Raghavi Chandu, Vivek Srikumar, Sameer Singh, Noah A. Smith. Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals. Findings of EMNLP 2024. PDF

Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe. The Unreasonable Effectiveness of Easy Training Data for Hard Tasks. ACL 2024. PDF, code, blogpost, conference video

Sarah Wiegreffe, Matthew Finlayson, Oyvind Tafjord, Peter Clark, Ashish Sabharwal. Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy. EMNLP 2023. PDF, code, conference video

Anshita Gupta+, Debanjan Mondal+, Akshay Krishna Sheshadri+, Wenlong Zhao, Xiang Lorraine Li+ and Sarah Wiegreffe+ and Niket Tandon+. Editing Common Sense in Transformers. EMNLP 2023. PDF, code, conference video

Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark. Self-Refine: Iterative Refinement with Self-Feedback. NeurIPS 2023. PDF, code, website, poster

Kaige Xie, Sarah Wiegreffe, Mark Riedl. Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes. Findings of EMNLP 2022. PDF, conference video

Xiangyu Peng+, Siyan Li+, Sarah Wiegreffe, Mark Riedl. Inferring the Reader: Guiding Automated Story Generation with Commonsense Reasoning. Findings of EMNLP 2022. PDF, code, conference video

Sarah Wiegreffe. Interpreting Neural Networks *for* and *with* Natural Language. PhD Dissertation. PDF

Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi. Reframing Human-AI Collaboration for Generating Free-Text Explanations. NAACL 2022. PDF, code, conference video

Sarah Wiegreffe+ and Ana Marasović+. Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing. NeurIPS 2021 Datasets and Benchmarks. PDF, companion website, conference video

Sarah Wiegreffe, Ana Marasović, Noah A. Smith. Measuring Association Between Labels and Free-Text Rationales. EMNLP 2021. PDF, code, conference video

Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron Wallace. Learning to Faithfully Rationalize by Construction. ACL 2020. PDF, code, conference video

Sarah Wiegreffe+ and Yuval Pinter+. Attention is not not Explanation. EMNLP 2019. PDF, code, conference talk, talk slides, non-technical blogpost

Sarah Wiegreffe, Gerardo Flores, Edward Choi, Andrew Dai. Learning Bi-Directional Clinical Event Representations: A Comparison of Architectures. Preprint (available upon request). 2019.

Sarah Wiegreffe, Edward Choi, Sherry Yan, Jimeng Sun, Jacob Eisenstein. Clinical Concept Extraction for Document-Level Coding. BioNLP Workshop at ACL 2019. PDF, poster

James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, Jacob Eisenstein. Explainable Prediction of Medical Codes from Clinical Text. NAACL 2018. PDF, code, talk slides (Google Drive)

Sarah Wiegreffe, Paul Anderson, Jihad Obeid. Can Classifications of Publications by Translational Categories be Automated? American Medical Informatics Association (AMIA) Joint Summits on Translational Science 2017. PDF, poster