Leonel Figueiredo de Alencar



Leonel Figueiredo de Alencar is a Professor at the Federal University of Ceará (UFC). Since 2004, he has been a faculty member of the graduate program in linguistics of the Humanities Center, carrying out research and teaching in formal grammar and computational linguistics. He has supervised master's theses and five Ph.D. dissertations to completion and currently supervises two master’s students and two doctoral students. He is the founder and coordinator of the CompLin Research Group on Natural Language and Computation.

Since March 2023, he has collaborated on the DACILAT Project at the State University of Campinas, which is funded by FAPESP (Process No. 22/09158-5). This project aims to build digital corpora and implement a rule-based machine translation system for the Brazilian indigenous languages Kadiwéu and Nheengatu.

In 1999, Leonel was awarded a scholarship from the CAPES Foundation to pursue a doctorate in Linguistics at the University of Konstanz in Germany, where he obtained his Ph.D. with great honors (magna cum laude) in 2003. In 2012, as a CAPES and DAAD Research Fellow, he was a Visiting Professor at the University of Konstanz. During his stay as a Visiting Scientist and CAPES Foundation Senior Postdoctoral Fellow at the University of Konstanz in 2013, he developed a computational grammar of Brazilian Portuguese within the LFG/XLE framework. From September 2013 to November 2015, he was a Fellow Researcher in the project “Computational Processing of Portuguese for Mobile Devices” of the Group of Computer Networks, Software Engineering and Systems (GREat), hosted by the Computer Science Department of the Federal University of Ceará.

From February 2021 to February 2022, he was a Visiting Professor and Postdoctoral Researcher at the School of Applied Mathematics of the Getúlio Vargas Foundation in Rio de Janeiro (EMAp/FGV). In his postdoctoral research, he implemented, in collaboration with Alexandre Rademaker, a professor at EMAp and researcher at IBM, a linguistically motivated computational grammar for Portuguese in the HPSG formalism using the Grammar Matrix framework and the LKB-FOS system. This type of grammar models the structures of a natural language in a mathematically precise manner. A parser can use it to produce deep syntactic and semantic analyses of the sentences of the language. Such analyses have proven valuable in industrial-scale natural language processing applications such as information extraction, question answering, and machine translation.

For over 20 years, his main research area has been natural language processing and computational linguistics from a generative grammar perspective. Presently, his main research targets the development of tools and resources for the computational processing of Nheengatu (also called Modern Tupi and Língua Geral Amazônica), especially focusing on the expansion and maintenance of the UD_Nheengatu-CompLin treebank of the Universal Dependencies collection. Past research topics include generative syntax, corpus linguistics, finite-state morphology, part-of-speech tagging, machine translation, and grammar engineering with the Lexical-Functional Grammar (LFG), Grammatical Framework, and HPSG formalisms.

He has authored or coauthored articles in peer-reviewed journals, conference papers, books, and book chapters, written in Portuguese, English, and German. These publications are exhaustively listed in his curriculum vitae on the Lattes Platform of CNPq. Many can be downloaded from his profile on ResearchGate. The following textbook on computational grammar development within LFG using the Xerox Linguistic Environment (XLE), a joint work with Christoph Schwarze (University of Konstanz), deserves mention:

For more information on this book, visit:

http://linguistlist.org/issues/27/27-761.html

http://lfg-book.blogspot.com.br/

A linguist without any formal background in computer science, Leonel is an enthusiast of programming languages. He has created or helped create various tools and resources for the computational processing of Portuguese, French, and Nheengatu. He is the author of Aelius, a corpus annotation tool for part-of-speech (POS) tagging of Portuguese. Together with Alexandre Rademaker, he is one of the main creators and maintainers of MorphoBr, a large-coverage lexical resource for computational morphological analysis of Portuguese.

He has been a program committee member, peer reviewer, and/or editorial board member for various journals and conferences in both linguistics and computer science, such as the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 1st Workshop on NLP for Indigenous Languages of Lusophone Countries (ILLC-NLP 2024, co-located with PROPOR 2024), 13th Language Resources and Evaluation Conference (LREC 2022), 11th Global WordNet Conference (GWC 2021), 68° Seminário do Grupo de Estudos Linguísticos de São Paulo (GEL 2021), 11° Congresso Internacional da Associação Brasileira de Linguística (ABRALIN50 2019), 3rd Workshop on Universal Dependencies (UDW, SyntaxFest 2019), 5th International Conference on Dependency Linguistics (Depling, SyntaxFest 2019), 27th International Conference on Computational Linguistics (COLING 2018), and 11th Symposium in Information and Human Language Technology (STIL 2017).