Marco Baroni:
Education and Academic/Professional History and
Activities
Education
- Ph.D. in Linguistics, University of California, Los Angeles, June
2000
Dissertation Title: Distributional cues in morpheme discovery:
A computational model and empirical evidence
Dissertation
Committee: Bruce Hayes (chair), Carson Schütze, Edward Stabler,
Donca Steriade, Jody Kreiman
- M.A. in Linguistics, University of California, Los
Angeles. December 1997
Thesis Title: The representation of prefixed
forms in the Italian lexicon: Evidence from the distribution of
intervocalic [s] and [z]
Thesis Committee: Bruce Hayes (chair),
Sun-Ah Jun, Carson Schütze, Donca Steriade
- Laurea in Linguistica ("110 e lode"), University of
Padua, Italy, April 1995
Thesis Title: La relazione tra struttura
segmentale e costituenza moraica [The relation between segmental
structure and moraic constituency]
Thesis Co-Chairs: Alberto Mioni
and Laura Vanelli
Work experience
- January 2019 - present
Research professor
Catalan Institution for Research and Advanced Studies (ICREA)
Barcelona (Spain)
ICREA website: https://www.icrea.cat/
- November 2016 - December 2021
Research scientist and manager
Facebook Artificial Intelligence Research (FAIR)
Paris (France)
FAIR
website: https://research.fb.com/category/facebook-ai-research-fair/
- November 2006 - November 2016
Associate professor (tenured
researcher until February 2013)
Center for Mind/Brain Sciences
(CIMeC)
Department of Information Engineering and Computer
Science (DISI)
Department of Cognitive and Education Sciences
(DISCoF) (until 2012)
School of Letters and Philosophy (until
2012)
University of Trento
CIMeC
website: http://www.cimec.unitn.it
- October 2002 - October 2006
Researcher (tenured position)
Dipartimento di Studi Interdisciplinari su Traduzione, Lingue e
Cultura (SITLEC)
University of Bologna, Italy
SITLEC website: http://www.sitlec.unibo.it
- September 2001 - August 2002
Researcher (position funded by EU
R&D project FASTY)
Natural Language Processing Group
Austrian
Research Institute for Artificial Intelligence (ÖFAI)
Vienna,
Austria
ÖFAI NLP group website: http://www.ai.univie.ac.at/oefai/nlu/
FASTY project website: http://www.fortec.tuwien.ac.at/reha.e/projects/fasty/fasty.html
- July 2000 - August 2001
Computational Linguist
Language
Development Team / Core Technologies Team
Conversay
Redmond
WA, USA
Conversay website: http://www.conversay.com/
- January - December 1999
Research Assistant to Prof. Pat
Keating (position funded by NSF project KDI)
Phonetics
Laboratory
Department of Linguistics
University of California,
Los Angeles
Los Angeles CA, USA
KDI project website: http://www.hei.org/research/projects/comneur/kdipage.htm
- Summer 1998
Summer Research Intern
Spoken Language
Processes Laboratory
House Ear Institute
Los Angeles CA,
USA
Institutional teaching
- Fall 2007 - Spring 2015
Co-instructor (coordinator until
Winter 2014) of the Computational Linguistics (formerly: Text
Processing) course
International Cognitive Science Master,
Center for Mind/Brain Sciences, University of Trento
Philosophy and Informatics Master, School of Letters and
Philosophy, University of Trento
Master in Human Language
Technologies and Interfaces, University of Trento (until Fall
2010)
- Fall 2009 - Fall 2014
Computational Skills for Text Analysis
(formerly: Perl Programming for Text Analysis)
International
Cognitive Science Master, Center for Mind/Brain Sciences,
University of Trento
Philosophy and Informatics Bachelor,
School of Letters and Philosophy, University of
Trento
- Fall 2008 - Winter 2012
Statistics Practicum (formerly: Linear
Models in R) module of the Computational and Statistical Methods
for Data Modelling and Analysis course
Doctoral Schools in
Cognitive and Brain Sciences and Psychological Sciences and
Education, University of Trento
- Winter 2011
Coordinator of the Topic Seminar SeCS,
Seminar on Computational Semantics
Doctoral School in
Cognitive and Brain Sciences, University of Trento
- Winter 2007 - Winter 2009
Computational Lexicography lab of
Humanities Computing course
School of Letters and
Philosophy, University of Trento
- Fall 2007 - Winter 2009
Introduction to Perl for Text
Processing (Humanities Computing module)
School of Letters and
Philosophy, University of Trento
- Winter 2009
Co-coordinator and co-instructor of the
Topic Seminar SExIE, Seminar on Extreme Information
Extraction
Doctoral School in Cognitive and Brain Sciences,
University of Trento
- Winter 2009
Lexicography module of Applied Linguistics
course
School of Letters and Philosophy, University of
Trento
- Winter 2008
Lexical Semantics module of General
Linguistics course
School of Letters and Philosophy,
University of Trento
- Fall 2007 - Winter 2008
Co-coordinator and co-instructor
of the Topic Seminar EviL, Evidence in Linguistics
Doctoral School in Cognitive and Brain Sciences, University of
Trento
- Winter 2007
Introduction to Corpora module of
Humanities Computing courses
School of Letters and Philosophy,
University of Trento
- Winter 2007
Collocations module of Applied Linguistics
course
School of Letters and Philosophy, University of
Trento
- Winter 2005 - Winter 2006
Automated Acquisition of Lexicon and
Terminology module of Terminology and Specialized Languages (I and
II).
SSLMIT, University of Bologna
- Winter 2004 - Fall 2005
Computational Linguistics
SSLMIT, University of Bologna
- Fall 2002 - Fall 2006
Phonetics/Phonology/Morphology
modules of General Linguistics course
SSLMIT,
University of Bologna
- Fall 1996 - Fall 1998
Teaching Assistant for the courses
Introduction to Linguistics, Experimental Phonetics and
Introduction to General Phonetics
Department of Linguistics,
University of California, Los Angeles
Other activities
- Please see the Lectures and Presentations section of my
Publications and Presentations page for a list of invited and keynote
talks
- Doctoral student supervision: Federico Boschetti (2009), Amaç
Herdagdelen (2010, UniTN best dissertation award), Gerhard Kremer
(2010), Elia Bruni (2013), Eva Maria Vecchi (co-supervision, 2013),
German Kruszewski (2016), Angeliki Lazaridou (2016), Nghia The Pham
(2016), Rahma Chaabouni (co-supervision, 2021), Roberto Dessi (2024), Olivier Ruest (co-supervision, 2024), Mateo Mahaut (in progress), Nathanael Carraz Rakotonirina (in progress), Emily Cheng (in progress)
- External examiner of PhD candidates: Diana Passino (University of
Padua, 2002), Simona Colombo (University of Turin, 2007), Jan
Pomikalek (Masaryk University Brno, 2011), Kateryna Tymoshenko (ICT
School, University of Trento, 2012), Paul Nulty (University College,
Dublin, 2013), Elias Iosif (Technical University of Crete, 2013),
Lorenzo Dell'Arciprete (University of Rome Tor Vergata, 2013), Paolo
Annesi (University of Rome Tor Vergata, 2013), Gozde Ozbal (ICT
School, University of Trento, 2013), Dieu-Thu Le (ICT School,
University of Trento, 2014), Aliaksei Severyn (ICT School, University
of Trento, 2015), Ali Orkan Bayer (ICT School, University of Trento,
2015), Abdellah Fourtassi (Ecole Normale Superieure, Paris, 2015),
Phong Le (Institute for Logic, Language and Computation, University of
Amsterdam, 2016), Manaal Faruqui (Language Technologies Institute,
Carnegie Mellon University, 2016), Douwe Kiela (Faculty of Computer
Science and Technology, University of Cambridge, 2016), Felix Hill
(Faculty of Computer Science and Technology, University of Cambridge,
2016), Ryan Lowe (Computer Science, McGill University, 2020), Andrea
(Daniela) Mihai (Faculty of Engineering and Physical Sciences,
University of Southampton, 2022), Karim Lasri (Ecole Normale
Superieure and PSL, Paris/CoLing Lab, University of Pisa, 2023),
Róbert Csordás (IDSIA, Università della Svizzera Italiana, 2023), Luca Moschella (Università La Sapienza, Rome, 2024),
Abhinav Gupta (Université de Montréal, 2024)
- Editorial board member
of the Transactions of the Association for Computational Linguistics, 2017-present
- Editorial board member
of Computational Linguistics, 2014-2016
- Area chair of NeurIPS 2024 (Thirty-eighth Annual Conference on Neural Information Processing Systems), December 2024
- Area chair of ICLR 2024 (Twelfth International Conference on Learning Representations), May 2024
- Member of the ERC Consolidator Grant SH4 Panel (The Human Mind and Its Complexity), 2019, 2021, 2023, 2025
- External tenure track evaluator for the University of Washington (Seattle, USA), 2023
- Remote panel evaluator for the ERC Synergy Grant, 2022, 2023, 2024
- Co-taught tutorial on Emergent Language-Based Coordination In Deep Multi-Agent Systems at EMNLP 2022
- Information officer
of SIGSEM, the Special Interest Group on Semantics
of the Association for Computational Linguistics, 2013-2022
- External habilitation evaluator for University Paul Sabatier (Toulouse III), 2022
- External promotion evaluator for the University of Leeds (UK), 2022
- Area chair of ICLR 2021 (Ninth International Conference on Learning Representations), May 2021
- Coordinated Birds-of-a-Feather and Group Mentoring sessions as part of the DI initiatives at EACL 2021 (16th Conference of the European Chapter of the Association for Computational Linguistics), April 2021
- External tenure track evaluator for McGill University (Montreal, Canada), 2020
- Interpretability and Analysis of Models for NLP area co-chair at
ACL 2020 (Annual meeting of the Association for Computational Linguistics), July 2020
- Organized the Language Emergence, Information Theory and All That Workshop, Barcelona (Spain), February 2020
- Co-organized the Lorentz Workshop on Compositionality in Brains and Machines, Leiden (the Netherlands), August 2019
- Co-organized the FAIR Understanding Human and Machine Intelligence Workshop, New York, May 2019
- Co-organized the Machine Intelligence Workshop at NIPS, Barcelona, December 2016
- Organizer of the CLIC Research Colloquium series, 2010-2015
- Coordinator of
the Language and Multimodal Interaction track of
the Cognitive Science Master program at the University
of Trento, 2012-2014
- Co-organizer and coordinator of
the BA
and MA majors in Philosophy and Informatics of the
Philosophy program at the University of Trento, 2009-2013
- Semantics area co-chair at
EMNLP 2015
(Conference on Empirical Methods in Natural Language Processing), September
2015
- External evaluator of Marco Turchi's tenure track at FBK
(Trento), 2013-2015
- Co-taught tutorial on A practical introduction to
distributional semantics
(slides) at
the Symposium
on Semantic Text Processing, Bar-Ilan University, November
2014
- Visiting fellow at
the Center for Advanced Studies of
Ludwig-Maximilians-Universität, Munich, October 2014
- Co-taught mini-course on Composition in Distributional
Semantics at ESSLLI 2014 (see
the COMPOSES project page for the slides),
Tübingen, August 2014
- Co-organized SemEval-2014 Task 1 (Evaluation of compositional
distributional semantic models on full sentences through semantic
relatedness and textual entailment)
- Tutorial co-chair of EACL 2014
- Cognitive Modeling area co-chair for CLIC.it (First Italian
Computational Linguistics Conference), December 2014
- Co-taught tutorial on Visual Features for Linguists
(slides) at ACL 2013
- Program co-chair of *SEM 2013
(Second Joint Conference on Lexical and Computational Semantics), June
2013
- Gave seminar A success story in research funding in the
Crash Course on Research Funding of the University of Trento, May
2012
- Gave tutorial on Compositionality in Distributional Semantics
at EACL
2012 (see the COMPOSES project page for the slides), Avignon,
April 2012
- Student mentor at EACL 2012
- Semantic Models area chair for
*SEM 2012
(First Joint Conference on Lexical and Computational Semantics), June
2012
- Prepared shared evaluation task for GEMS 2011: GEometrical Models of Natural Language Semantics, July 2011
- Taught mini-course on Distributional Semantics as part of
the ADT-TM school of the MEMOTEF Department, La
Sapienza University, Rome, January 2011
- My team participated in the EVALITA 2009 Lexical Substitution track
- Taught mini-course on Distributional Semantics at
the GLIF center of the Universitat Pompeu Fabra,
Barcelona, June 2009
- Member of the program committee of ESSLLI 2009, Bordeaux, July 2009
- Taught at the TRIPLE Winter School on The lexicon: analysis methods, models and applications, January 2009
- Co-organizer of
the ESSLLI
2008 Distributional Lexical Semantics Workshop,
Hamburg, August 2008
- Co-taught mini-course on Statistical programming in R for
computational linguists at
the Computational
Linguistics Fall School of the German Linguistics Association,
University of Potsdam, September 2007
- My team participated in
the EVALITA 2007 initiative in
the POS Tagging track: our system was a close second in the evaluation
- Co-organizer of
the Contextual
Information in Semantic Space Models workshop at Context 07,
Roskilde University, August 2007
- Invited visiting scholar at the National Institute for Japanese
Language, Tokyo, Japan, July-August 2007
- Co-coordinator of
the CLEANEVAL
shared task on automated cleaning of Web data, 2007
- Co-organizer of
the LCT
Colloquia of the Universities of Bolzano and Trento
- Secretary
of SIGWAC, the Special
Interest Group on Web as Corpus of the Association for Computational
Linguistics, 2007-2009
- Co-taught mini-course on Counting words: an introduction to
lexical statistics at ESSLLI 2006, Malaga, August 2006
- Taught mini-course Morphology and corpora: the case of
quantitative productivity at the University of Granada, May
2006
- Co-organizer of workshop on The Web as Corpus, EACL 2006,
Trento, April 2006
- Co-taught intensive mini-course Statistical Methods for Corpus
Exploitation at EURAC, Bolzano, October 2005
- Co-organizer of workshop on The Web as Corpus, Corpus
Linguistics 2005, Birmingham
(http://sslmit.unibo.it/~baroni/web_as_corpus_cl05.html),
July 2005
- Visiting scholar at the Austrian Research Institute for Artificial
Intelligence (ÖFAI), Vienna, Austria, May-August 2005
- Taught mini-course Statistics for Corpus Linguistics,
SITLEC, Forlì, April-May 2005
- Co-coordinator of the WaCky project (http://wacky.sslmit.unibo.it/)
- Co-organized workshop on The Web as Corpus, SSLMIT/SITLEC,
January 2005
- Administrator of the
site http://e-learning.sslmit.unibo.it/,
2003-2006
- Secretary of the entrance exam committee, SSLMIT, September 2003,
September 2004, September 2005.
- Co-organized and co-taught intensive mini-course A Practical
Introduction to Corpus Work, Bertinoro University Center, October
2003
- Co-coordinated the CORAL (CORpora e Apprendimento
Linguistico) e-learning project (http://www.e-learning.sslmit.unibo.it/COR/)
- Helped organize the Interdepartmental Workshop on Science and
Common Sense, University of Padua, May 1995
- Reviewer for: ACL (2008,
2009, 2011-2018, 2024, awards committee), UK Research and Innovation (UKRI)
(2024), Netherlands Organisation for Scientific Research
(NWO) (2015, 2024), CVPR (2023), EMNLP (2007,
2010, best reviewer award, 2011, 2013, 2017, 2018, 2022,
2023), WiNLP (2017-2020, 2022, 2023), Current Directions
in Psychological Science (2023), Isogloss
(2023), Polity Press (2023), Deutsche
Forschungsgemeinschaft (DFG) (2022), Fundación Española
para la Ciencia y la Tecnología (2022), Cognition
(2021), Trends in Cognitive Sciences (2021), NAACL
(2013, 2015, 2016, 2018, 2021 as member of the best-paper award
committee), European Research Council (ERC) Starting and
Consolidator Grants (2012, 2018, 2020), ICLR (2016-2018), Learning in Humans
and Machines Workshop (L2HM) (2018), Transactions of the
Association for Computational Linguistics (in standing
committee of reviewers, 2012-2017), NIPS (2017), Journal
of Artificial Intelligence Research (2012, 2014,
2017), Natural Language Engineering (2009-2010,
2017), Evaluating General Artificial Intelligence Workshop
(2017), Centre for Linguistic Theory and Studies in Probability
(CLASP, University of Gothenburg) (2017), Israel Science
Foundation (2016, 2017), RepEval: Workshop on Evaluating
Vector Space Representations for NLP (2016, 2017), British
Academy (2017), ICML (2016), Computational
Linguistics (2011, 2013-2016), European Research Council
(ERC) Advanced Grants (2016), ESSLLI DSALT Workshop
(2016), CLIC-it (2016), CONLL (2014,
2015), Heller Research Fellowship in Computer Science
(Cambridge) (2015), Journal of Memory and Language
(2015), EMNLP Vision and Language Workshop (VL15)
(2015), Continuous Vector Space Models and their
Compositionality Workshop (2014-2015), NAACL Workshop on
Cognitive Modeling and Computational Linguistics
(2015), Workshop on Multimodal Semantics for Robotic Systems
(MuSRobS) (2015), CHIST-ERA (European Coordinated Research
on Long-term Challenges in Information and Communication Sciences
& Technologies ERA-Net) (2015), IWCS (2011, 2013,
2015), NetWordS Word Knowledge and Word Usage Conference
(2015), Italian Ministry of Education and Research SIR
projects (2014), Natural Sciences and Engineering Research
Council of Canada (2014), COLING (2008, 2010,
2014), EACL (2008, 2014), *SEM (2014), EACL
Cognitive Aspects of Computational Language
Acquisition/Learning/Loss Workshop (2009, 2012, 2014), EACL
Student Workshop (2014), SIGIR Workshop on Semantic
Matching in Information Retrieval (2014), Towards a Formal
Distributional Semantics Workshop at IWCS (2013), V&L
Net (EPSRC Network on Vision and Language) (2013), Behavior
Research Methods (2012, 2013), Quantitative Investigations
in Theoretical Linguistics (QITL) (2006, 2008, 2011,
2013), Generative Lexicon (2009, 2013), AI
Communications (2013), Cognitive Science (2011,
2012), Language and Linguistics Compass (2012), Brain
Informatics (2012), ACL Joint Workshop on Statistical
Parsing and Semantic Processing of Morphologically Rich Languages
(2012), Linguistic Evidence (2012), Italian Ministry
of Education and Research (PRIN projects) (2012), Italian
National Agency for the Evaluation of the University and Research
System (ANVUR) (2012), Bar-Ilan University (2012),
Journal of the Royal Society Interface (2011, 2012),
Research Foundation Flanders (FWO) (2011, 2012), IJCNLP
(2008, 2011), Geometrical Models of Natural Language Semantics
(GEMS) Workshop (2009-2011), Distributional Semantics and
Compositionality (DISCO) Workshop (2011), Language Resources and Evaluation
Journal (2007-2008, 2011), ACM Transactions on Speech and
Language Processing (2010), Cognitive Aspects of Computational
Language Acquisition (Springer book, 2010), Bolzano's Province
Research Office (2010), SemEval
(2010), ESSLLI Compositionality and Distributional Semantic
Models Workshop (2010), NAACL Computational Neurolinguistics
Workshop (2010), CogSci Distributional Semantics beyond Concrete Concepts
Workshop (2009), RANLP Workshop on NLP
Methods and Corpora in Translation, Lexicography, and Language
Learning (2009), IEEE Intelligent
Systems (2008), Human Judgments in Computational Linguistics
Workshop at COLING (2008), LREC (2008, 2010), the ESSLLI
Student Workshop (2008), Italian Journal of Linguistics
(2008), Cognitive Linguistics (2008), the UK Economic and
Social Research Council (ESRC) (2007), Europhysics Letters
(2007), Artificial Intelligence Journal (2007), WAC3
Workshop (2007), the US National Science Foundation (NSF)
(2005-2007), Morphology (2007), AMML Workshop at RANLP
2007 (2007), Web Genres Colloquium at Corpus Linguistics
2007 (2007), Corpus linguistics: An international handbook
(2006), Languages in Contrast (2006), Journal of the
International Phonetic Association (2002, 2003), Phonetica
(2001), Journal of the Acoustical Society of America (2000),
Journal of Phonetics (1998)
Funded projects and scholarships
- Geometric Approaches to Language Model Analysis, Apple Academic Collaboration Grant, 2025-2026, academic PI
- ALiEN, ERC Advanced Grant, 2022-2027, PI
- COMPOSES, ERC Starting Grant, 2011-2016, PI
- LOVe, Marie Sklodowska-Curie project, 2015-2017, supervisor
- European Network on Integrating Vision and Language (iV&L
Net), ICT COST Action, 2014-2018, Management Committee member and
dissemination officer until 2015
- Web 2.0 as Corpus, Google Research Award, 2010-2013, PI
- Conceptual Representation in the Blind: Empirical Data and Computational
Simulations, PRIN project, 2010-2012, PI
- PAISA', FIRB project, 2009-2011, PI
- LiveMemories, PAT project, 2008-2010, participant (contributed to proposal writing)
- LiMiNE, University of Bologna strategic programme, 2007-2009,
external consultant (contributed to proposal writing)
- Invited Scholar Fellowship, the National Institute for Japanese
Language, Tokyo, 2007
- CompoNet, PRIN project, 2006-2008, participant (contributed to
proposal writing)
- Marco Polo Scholarship for a research period abroad, University of
Bologna, 2005
- Chancellor Fellowship, University of California, Los Angeles,
1995-2000
- Summer School Fellowship, San Marino Center for Semiotic and
Cognitive Studies, 1995
- Education Abroad Program Fellowship, University of California, Los
Angeles, 1993-1994
- Summer School Fellowship, University of Bucharest, Romania,
1993
Contact
Email address: mbaroni AT gmail com
Snail mail address: Marco Baroni, Departament de Traduccio i Ciencies del Llenguatge, Universitat Pompeu Fabra, Roc Boronat 138, 08018 Barcelona, Spain.
Phone: +34 93 542 2244