Academic Research

Here are some highlights of my academic interests. See the complete list of my publications below or on Google Scholar.

Computational Argumentation and Debating

Research in computational argumentation focuses on understanding and generating arguments, with applications in debating systems and listening comprehension.

Selected Publications

Personalized Machine Translation

Research in personalized machine translation aims to adapt translation systems to individual users' preferences and characteristics.

Selected Publications

Model-aware Improvement of Source Translatability

Research in improving source translatability focuses on modifying source texts to enhance translation quality.

Selected Publications

Academic Service

Program Committee Member

ACL 2024 (Area Chair) // W-NUT 2022 // ARR April 2022 / ACL 2022 (ARR) // W-NUT 2021 // EMNLP 2021 // EACL 2021 // COLING 2020 // *SEM 2020 // EMNLP 2020 // ACL 2020 // LREC 2020 // W-NUT 2019 // ACL 2019 // NLP+CSS 2019 // COLING 2018 // ACL 2018 // NAACL 2018 // EMNLP 2017 // *SEM 2017 // ACL 2017 // Journal of Natural Language Engineering (JNLE) 2016 // COLING 2016 // LREC 2016 // EMNLP 2016 // *SEM 2016 // EMNLP 2015 // *SEM 2015 // CICLING 2015 // Journal of Language Resources and Evaluation (LREV) 2014 // EMNLP 2014 // COLING 2014 // WMT 2014 // LREC 2014 // WMT 2013 // Journal of Language Resources and Evaluation (LREV) 2013 // IJCNLP 2013 // *SEM 2013 // Journal of Computer Science and Technology (JCST) 2013 // WMT 2012 // EACL 2012 // LREC 2012 // ACM TIST Journal, Special Issue on Paraphrasing 2011 // EMNLP 2011 // TextInfer 2011 // COLING 2010 // EMNLP 2009 // AAAI 2008

All Publications

2024

All languages matter: Evaluating LMMs on culturally diverse 100 languages

Vayani, Ashmal; Dissanayake, Dinura; Watawana, Hasindri; Ahsan, Noor; Sasikumar, Nevasini; Thawakar, Omkar; Ademtew, Henok Biadglign; Hmaiti, Yahya; Kumar, Amandeep; Kuckreja, Kartik

arXiv preprint arXiv:2411.16508

2023

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Danchev, V; Nikoulina, Vassilina; Laippala, Veronika; Lepercq, Violette; Prabhu, Vrinda; Alyafeai, Zaid; Talat, Zeerak; Raja, Arun; Heinzerling, Benjamin; Si, Chenglei

2022

BLOOM: A 176B-parameter open-access multilingual language model

Scao, Teven Le; Fan, Angela; Akiki, Christopher; Pavlick, Ellie; Ilić, Suzana; Hesslow, Daniel; Castagné, Roman; Luccioni, Alexandra Sasha; Yvon, François; Gallé, Matthias

arXiv preprint arXiv:2211.05100

Emergent Structures and Training Dynamics in Large Language Models

Teehan, Ryan; Clinciu, Miruna; Serikov, Oleg; Szczechla, Eliza; Seelam, Natasha; Mirkin, Shachar; Gokaslan, Aaron

Challenges & Perspectives in Creating Large Language Models

2021

An autonomous debating system

Slonim, Noam; Bilu, Yonatan; Alzate, Carlos; Bar-Haim, Roy; Bogin, Ben; Bonin, Francesca; Choshen, Leshem; Cohen-Karlik, Edo; Dankin, Lena; Edelstein, Lilach

Nature, Volume 591 (7850), pp. 379-384

2019

A Dataset of General-Purpose Rebuttal

Orbach, Matan; Bilu, Yonatan; Gera, Ariel; Kantor, Yoav; Dankin, Lena; Lavee, Tamar; Kotlerman, Lili; Mirkin, Shachar; Jacovi, Michal; Aharonov, Ranit

arXiv preprint arXiv:1909.00393

Towards effective rebuttal: Listening comprehension using corpus-wide claim mining

Lavee, Tamar; Orbach, Matan; Kotlerman, Lili; Kantor, Yoav; Gretz, Shai; Dankin, Lena; Mirkin, Shachar; Jacovi, Michal; Bilu, Yonatan; Aharonov, Ranit

arXiv preprint arXiv:1907.11889

2018

Listening comprehension over argumentative content

Mirkin, Shachar; Moshkowich, Guy; Orbach, Matan; Kotlerman, Lili; Kantor, Yoav; Lavee, Tamar; Jacovi, Michal; Bilu, Yonatan; Aharonov, Ranit; Slonim, Noam

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 719-724

What did you mention? A large scale mention detection benchmark for spoken and written text

Mass, Yosi; Kotlerman, Lili; Mirkin, Shachar; Venezian, Elad; Witzling, Gera; Slonim, Noam

arXiv preprint arXiv:1801.07507

System and method for predicting an optimal machine translation system for a user based on an updated user profile

Mirkin, Shachar; Meunier, Jean-Luc

2017

A recorded debating dataset

Mirkin, Shachar; Jacovi, Michal; Lavee, Tamar; Kuo, Hong-Kwang; Thomas, Samuel; Sager, Leslie; Kotlerman, Lili; Venezian, Elad; Slonim, Noam

arXiv preprint arXiv:1709.06438

Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks

Pahuja, Vardaan; Laha, Anirban; Mirkin, Shachar; Raykar, Vikas; Kotlerman, Lili; Lev, Guy

Interspeech 2017

Personalized Machine Translation: Preserving Original Author Traits

Rabinovich, Ella; Mirkin, Shachar; Patel, Raj Nath; Specia, Lucia; Wintner, Shuly

EACL 2017

2016

Learning generation templates from dialog transcripts

Venkatapathy, Sriram; Mirkin, Shachar; Dymetman, Marc

Method and system for summarizing a document

Gupta, Anand; Kaur, Manpreet; Mirkin, Shachar

System and method for incrementally updating a reordering model for a statistical machine translation system

Mirkin, Shachar

2015

Motivating Personality-aware Machine Translation

Mirkin, Shachar; Nowson, Scott; Brun, Caroline; Perez, Julien

The 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Personalized machine translation: Predicting translational preferences

Mirkin, Shachar; Meunier, Jean-Luc

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 2019-2025

XRCE personal language analytics engine for multilingual author profiling

Nowson, Scott; Perez, Julien; Brun, Caroline; Mirkin, Shachar; Roux, Claude

Working Notes Papers of the CLEF, pp. 1412-1424

Semantic refining of cross-lingual information retrieval results

Mirkin, Shachar; Lagos, Nikolaos; Calapodescu, Ioan

Refining inference rules with temporal event clustering

Jacquet, Guillaume; Mirkin, Shachar

Machine translation-driven authoring system and method

Venkatapathy, Sriram; Mirkin, Shachar

2014

Confidence-driven rewriting of source texts for improved translation

Mirkin, Shachar; Venkatapathy, Sriram; Dymetman, Marc

Incrementally Updating the SMT Reordering Model

Mirkin, Shachar

Proceedings of The 28th Pacific Asia Conference on Language, Information and Computing (PACLIC), Phuket, Thailand

Comparison of data selection techniques for the translation of video lectures

Wuebker, Joern; Ney, Hermann; Martínez-Villaronga, Adrià; Giménez Pastor, Adrián; Juan Císcar, Alfonso; Servan, Christophe; Dymetman, Marc; Mirkin, Shachar

Data Selection for Compact Adapted SMT Models

Mirkin, Shachar; Besacier, Laurent

Proceedings of AMTA

Text summarization through entailment-based minimum vertex cover

Gupta, Anand; Kaur, Manpreet; Mirkin, Shachar; Singh, Adarsh; Goyal, Aseem

Proceedings of the Third Joint Conference on Lexical and Computational Semantics (*SEM 2014), pp. 75-80

2013

Confidence-driven Rewriting for Improved Translation

Mirkin, Shachar; Venkatapathy, Sriram; Dymetman, Marc

MT Summit

Assessing quick update methods of statistical translation models

Mirkin, Shachar; Cancedda, Nicola

Proceedings of the International Workshop on Spoken Language Translation (IWSLT), pp. 264-271

Error prediction with partial feedback

Darling, William; Archambeau, Cédric; Mirkin, Shachar; Bouchard, Guillaume

Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2013, Prague, Czech Republic, September 23-27, 2013, Proceedings, Part II 13, pp. 80-94

SORT: An Interactive Source-Rewriting Tool for Improved Translation

Mirkin, Shachar; Venkatapathy, Sriram; Dymetman, Marc; Calapodescu, Ioan

ACL Demos

2012

An SMT-driven authoring tool

Venkatapathy, Sriram; Mirkin, Shachar

Proceedings of COLING 2012: Demonstration Papers, pp. 459-466

2011

Context and Discourse in Textual Entailment Inference

Mirkin, Shachar

Knowledge and Tree-Edits in Learnable Entailment Proofs

Stern, Asher; Mirkin, Shachar; Shnarch, Eyal; Kotlerman, Lili; Dagan, Ido; Lotan, Amnon; Berant, Jonathan

TAC

Classification-based contextual preferences

Mirkin, Shachar; Dagan, Ido; Kotlerman, Lili; Szpektor, Idan

Proceedings of the TextInfer 2011 Workshop on Textual Entailment, pp. 20-29

2010

Rule Chaining and Approximate Match in textual inference

Stern, Asher; Shnarch, Eyal; Mirkin, Shachar; Kotlerman, Lili; Zeichner, Naomi; Dagan, Ido; Lotan, Amnon; Berant, Jonathan

TAC

Learning an expert from human annotations in statistical machine translation: The case of out-of-vocabulary words

Aziz, Wilker; Dymetman, Marc; Specia, Lucia; Mirkin, Shachar

Proceedings of the 14th Annual Conference of the European Association for Machine Translation

Recognising entailment within discourse

Mirkin, Shachar; Berant, Jonathan; Dagan, Ido; Shnarch, Eyal

Proceedings of the 23rd International Conference on Computational Linguistics, pp. 770-778

A Resource for Investigating the Impact of Anaphora and Coreference on Inference

Abad, Azad; Bentivogli, Luisa; Dagan, Ido; Giampiccolo, Danilo; Mirkin, Shachar; Pianta, Emanuele; Stern, Asher

LREC

Assessing the role of discourse references in entailment inference

Mirkin, Shachar; Dagan, Ido; Padó, Sebastian

Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1209-1219

2009

Addressing Discourse and Document Structure in the RTE Search Task

Mirkin, Shachar; Bar-Haim, Roy; Dagan, Ido; Shnarch, Eyal; Stern, Asher; Szpektor, Idan; Berant, Jonathan

TAC

Evaluating the inferential utility of lexical-semantic resources

Mirkin, Shachar; Dagan, Ido; Shnarch, Eyal

Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), pp. 558-566

Source-language entailment modeling for translating unknown terms

Mirkin, Shachar; Specia, Lucia; Cancedda, Nicola; Dagan, Ido; Dymetman, Marc; Szpektor, Idan

Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pp. 791-799

2008

Efficient semantic deduction and approximate matching over compact parse forests

Bar-Haim, Roy; Berant, Jonathan; Dagan, Ido; Greental, Iddo; Mirkin, Shachar; Shnarch, Eyal; Szpektor, Idan

Proceedings of TAC

2006

Integrating pattern-based and distributional similarity methods for lexical entailment acquisition

Mirkin, Shachar; Dagan, Ido; Geffet, Maayan

Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pp. 579-586

Contextual Preferences for Name-based Text Categorization

Mirkin, Shachar; Dagan, Ido; Kotlerman, Lili; Szpektor, Idan