Academic Research
Here are some highlights of my academic interests. See the complete list of my publications below or on Google Scholar.
Computational Argumentation and Debating
Research in computational argumentation focuses on understanding and generating arguments, with applications in debating systems and listening comprehension.
Selected Publications
- Noam Slonim et al. An autonomous debating system. Nature 591, 379–384 (2021).
- Shachar Mirkin et al. Listening Comprehension over Argumentative Content. EMNLP 2018.
- Shachar Mirkin et al. A Recorded Debating Dataset. LREC 2018.
Personalized Machine Translation
Research in personalized machine translation aims to adapt translation systems to individual users' preferences and characteristics.
Selected Publications
- Ella Rabinovich et al. Personalized Machine Translation: Preserving Original Author Traits. EACL 2017.
- Shachar Mirkin and Jean-Luc Meunier. Personalized machine translation: Predicting translational preferences. EMNLP 2015.
Model-aware Improvement of Source Translatability
Research in improving source translatability focuses on modifying source texts to enhance translation quality.
Selected Publications
- Shachar Mirkin et al. Confidence-driven Rewriting for Improved Translation. MT Summit 2013.
- Sriram Venkatapathy and Shachar Mirkin. An SMT-driven Authoring Tool. COLING 2012.
Academic Service
Program Committee Member
ACL 2024 (Area Chair) // W-NUT 2022 // ARR April 2022 // ACL 2022 (ARR) // W-NUT 2021 // EMNLP 2021 // EACL 2021 // COLING 2020 // *SEM 2020 // EMNLP 2020 // ACL 2020 // LREC 2020 // W-NUT 2019 // ACL 2019 // NLP+CSS 2019 // COLING 2018 // ACL 2018 // NAACL 2018 // EMNLP 2017 // *SEM 2017 // ACL 2017 // Journal of Natural Language Engineering (JNLE) 2016 // COLING 2016 // LREC 2016 // EMNLP 2016 // *SEM 2016 // EMNLP 2015 // *SEM 2015 // CICLING 2015 // Journal of Language Resources and Evaluation (LREV) 2014 // EMNLP 2014 // COLING 2014 // WMT 2014 // LREC 2014 // WMT 2013 // Journal of Language Resources and Evaluation (LREV) 2013 // IJCNLP 2013 // *SEM 2013 // Journal of Computer Science and Technology (JCST) 2013 // WMT 2012 // EACL 2012 // LREC 2012 // ACM TIST Journal, Special Issue on Paraphrasing 2011 // EMNLP 2011 // TextInfer 2011 // COLING 2010 // EMNLP 2009 // AAAI 2008
All Publications
2024
All languages matter: Evaluating LMMs on culturally diverse 100 languages
arXiv preprint arXiv:2411.16508
2023
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
2022
BLOOM: A 176B-parameter open-access multilingual language model
arXiv preprint arXiv:2211.05100
Emergent Structures and Training Dynamics in Large Language Models
Challenges & Perspectives in Creating Large Language Models
2021
An autonomous debating system
Nature, Volume 591 (7850), pp. 379-384
2019
A Dataset of General-Purpose Rebuttal
arXiv preprint arXiv:1909.00393
Towards effective rebuttal: Listening comprehension using corpus-wide claim mining
arXiv preprint arXiv:1907.11889
2018
Listening comprehension over argumentative content
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 719-724
What did you mention? A large scale mention detection benchmark for spoken and written text
arXiv preprint arXiv:1801.07507
System and method for predicting an optimal machine translation system for a user based on an updated user profile
2017
A recorded debating dataset
arXiv preprint arXiv:1709.06438
Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks
Interspeech 2017
Personalized Machine Translation: Preserving Original Author Traits
EACL 2017
2016
Learning generation templates from dialog transcripts
Method and system for summarizing a document
System and method for incrementally updating a reordering model for a statistical machine translation system
2015
Motivating Personality-aware Machine Translation
The 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Personalized machine translation: Predicting translational preferences
Proceedings of the 2015 conference on empirical methods in natural language processing, pp. 2019-2025
XRCE personal language analytics engine for multilingual author profiling
Working Notes Papers of the CLEF, pp. 1412-1424
Semantic refining of cross-lingual information retrieval results
Refining inference rules with temporal event clustering
Machine translation-driven authoring system and method
2014
Confidence-driven rewriting of source texts for improved translation
Incrementally Updating the SMT Reordering Model
Proceedings of The 28th Pacific Asia Conference on Language, Information and Computing (PACLIC), Phuket, Thailand
Comparison of data selection techniques for the translation of video lectures
Data Selection for Compact Adapted SMT Models
Proceedings of AMTA
Text summarization through entailment-based minimum vertex cover
Proceedings of the Third Joint Conference on Lexical and Computational Semantics (*SEM 2014), pp. 75-80
2013
Confidence-driven Rewriting for Improved Translation
MT Summit
Assessing quick update methods of statistical translation models
Proceedings of the International Workshop on Spoken Language Translation (IWSLT), pp. 264-271
Error prediction with partial feedback
Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2013, Prague, Czech Republic, September 23-27, 2013, Proceedings, Part II, pp. 80-94
SORT: An Interactive Source-Rewriting Tool for Improved Translation
ACL Demos
2012
An SMT-driven authoring tool
Proceedings of COLING 2012: Demonstration Papers, pp. 459-466
2011
Context and Discourse in Textual Entailment Inference
Knowledge and Tree-Edits in Learnable Entailment Proofs
TAC
Classification-based contextual preferences
Proceedings of the TextInfer 2011 Workshop on Textual Entailment, pp. 20-29
2010
Rule Chaining and Approximate Match in textual inference
TAC
Learning an expert from human annotations in statistical machine translation: The case of out-of-vocabulary words
Proceedings of the 14th Annual Conference of the European Association for Machine Translation
Recognising entailment within discourse
Proceedings of the 23rd International Conference on Computational Linguistics, pp. 770-778
A Resource for Investigating the Impact of Anaphora and Coreference on Inference
LREC
Assessing the role of discourse references in entailment inference
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1209-1219
2009
Addressing Discourse and Document Structure in the RTE Search Task
TAC
Evaluating the inferential utility of lexical-semantic resources
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), pp. 558-566
Source-language entailment modeling for translating unknown terms
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pp. 791-799
2008
Efficient semantic deduction and approximate matching over compact parse forests
Proceedings of TAC
2006
Integrating pattern-based and distributional similarity methods for lexical entailment acquisition
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pp. 579-586