“+” denotes equal contribution.
Jack Merullo, Noah A. Smith, Sarah Wiegreffe+ and Yanai Elazar+. On Linear Representations and Pretraining Data Frequency in Language Models.
Sarah Wiegreffe, Oyvind Tafjord, Yonatan Belinkov, Hannaneh Hajishirzi, Ashish Sabharwal. Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions. PDF, code
Publications
Faeze Brahman, Sachin Kumar, Vidhisha Balachandran+ and Pradeep Dasigi+ and Valentina Pyatkin+ and Abhilasha Ravichander+ and Sarah Wiegreffe+, Nouha Dziri, Khyathi Chandu, Jack Hessel, Yulia Tsvetkov, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi. The Art of Saying No: Contextual Noncompliance in Language Models. NeurIPS 2024 Datasets and Benchmarks. PDF, code & models, data
Naomi Saphra+ and Sarah Wiegreffe+. Mechanistic? BlackBoxNLP workshop at EMNLP 2024. PDF
Shramay Palta, Nishant Balepur, Peter A. Rankel, Sarah Wiegreffe, Marine Carpuat, Rachel Rudinger. Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning. Findings of EMNLP 2024. PDF
Yanai Elazar, Bhargavi Paranjape+ and Hao Peng+ and Sarah Wiegreffe+, Khyathi Raghavi Chandu, Vivek Srikumar, Sameer Singh, Noah A. Smith. Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals. Findings of EMNLP 2024. PDF
Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe. The Unreasonable Effectiveness of Easy Training Data for Hard Tasks. ACL 2024. PDF, code, blogpost, conference video
Sarah Wiegreffe, Matthew Finlayson, Oyvind Tafjord, Peter Clark, Ashish Sabharwal. Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy. EMNLP 2023. PDF, code, conference video
Anshita Gupta+, Debanjan Mondal+, Akshay Krishna Sheshadri+, Wenlong Zhao, Xiang Lorraine Li+ and Sarah Wiegreffe+ and Niket Tandon+. Editing Common Sense in Transformers. EMNLP 2023. PDF, code, conference video
Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark. Self-Refine: Iterative Refinement with Self-Feedback. NeurIPS 2023. PDF, code, website, poster
Kaige Xie, Sarah Wiegreffe, Mark Riedl. Calibrating Trust of Multi-Hop Question Answering Systems with Decompositional Probes. Findings of EMNLP 2022. PDF, conference video
Xiangyu Peng+, Siyan Li+, Sarah Wiegreffe, Mark Riedl. Inferring the Reader: Guiding Automated Story Generation with Commonsense Reasoning. Findings of EMNLP 2022. PDF, code, conference video
Sarah Wiegreffe. Interpreting Neural Networks *for* and *with* Natural Language. PhD Dissertation. PDF
Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi. Reframing Human-AI Collaboration for Generating Free-Text Explanations. NAACL 2022. PDF, code, conference video
Sarah Wiegreffe+ and Ana Marasović+. Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing. NeurIPS 2021 Datasets and Benchmarks. PDF, companion website, conference video
Sarah Wiegreffe, Ana Marasović, Noah A. Smith. Measuring Association Between Labels and Free-Text Rationales. EMNLP 2021. PDF, code, conference video
Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron Wallace. Learning to Faithfully Rationalize by Construction. ACL 2020. PDF, code, conference video
Sarah Wiegreffe+ and Yuval Pinter+. Attention is not not Explanation. EMNLP 2019. PDF, code, conference talk, talk slides, non-technical blogpost
Sarah Wiegreffe, Gerardo Flores, Edward Choi, Andrew Dai. Learning Bi-Directional Clinical Event Representations: A Comparison of Architectures. Preprint (available upon request). 2019.
Sarah Wiegreffe, Edward Choi, Sherry Yan, Jimeng Sun, Jacob Eisenstein. Clinical Concept Extraction for Document-Level Coding. BioNLP Workshop at ACL 2019. PDF, poster
James Mullenbach, Sarah Wiegreffe, Jon Duke, Jimeng Sun, Jacob Eisenstein. Explainable Prediction of Medical Codes from Clinical Text. NAACL 2018. PDF, code, talk slides (Google Drive)
Sarah Wiegreffe, Paul Anderson, Jihad Obeid. Can Classifications of Publications by Translational Categories be Automated? American Medical Informatics Association (AMIA) Joint Summits on Translational Science 2017. PDF, poster