Semantic-syntactic classification of Croatian verbs
Funded by: Croatian Science Foundation
Project number: IP-2022-10-8074
The project is conducted at: Institute for the Croatian Language
Duration: from 31 December 2023 to 30 December 2027
Research Team:
Mia Batinić Angster, Department of Linguistics, University of Zadar
Matea Birtić, Institute for the Croatian Language
Branimir Belaj, Faculty of Humanities and Social Sciences, University of Osijek
Ivana Brač, Institute for the Croatian Language, head of the project
Daria Lazić, Institute for the Croatian Language
Maja Matijević, Institute for the Croatian Language
Iva Nazalević Čučević, Faculty of Humanities and Social Sciences, University of Zagreb
Ana Ostroški Anić, Institute for the Croatian Language
Kristina Štrkalj Despot, Institute for the Croatian Language
Siniša Runjaić, Institute for the Croatian Language
Project summary:
The project aims to determine the semantic classes to which the 500 most frequent Croatian verbs belong, along with their prototypical syntactic patterns and semantic roles. It will also explore, from a theoretical perspective, the relationship between the semantics and syntax of Croatian verbs within a semantic class and across classes. An equally important result is a database containing a detailed description of the verbs’ argument structure. Each verb sense will be associated with a semantic class, and the type of each complement and its semantic role will be determined on the basis of corpus examples. The project will investigate semantic relations among verbs, such as synonymy, antonymy, and troponymy. Additionally, it will explore the role of verbs in cognitive mechanisms such as metaphor and metonymy, as well as syntactic alternations and other grammatical phenomena.
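To make the planned database entries more concrete, the following is a minimal sketch of how one verb sense with its semantic class, complements, and semantic roles could be represented. It is an illustration only, not the project's actual schema: the names `Complement`, `VerbSense`, and `group_by_class`, the complement labels, and the class label "transfer" are all assumptions introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Complement:
    """One complement of a verb sense (hypothetical structure)."""
    form: str            # morphosyntactic type, e.g. "NP-accusative"
    semantic_role: str   # e.g. "Agent", "Theme", "Recipient"


@dataclass
class VerbSense:
    """One sense of a verb, linked to a semantic class (hypothetical structure)."""
    lemma: str
    gloss: str
    semantic_class: str
    complements: List[Complement] = field(default_factory=list)

    def pattern(self) -> str:
        """Return the syntactic pattern as a readable string."""
        return " + ".join(c.form for c in self.complements)


def group_by_class(senses: List[VerbSense]) -> Dict[str, List[str]]:
    """Group verb lemmas by the semantic class of each sense."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for sense in senses:
        groups[sense.semantic_class].append(sense.lemma)
    return dict(groups)


# Illustrative entries only; the class label is an assumption, not project data.
transfer_frame = [
    Complement("NP-nominative", "Agent"),
    Complement("NP-accusative", "Theme"),
    Complement("NP-dative", "Recipient"),
]
dati = VerbSense("dati", "to give", "transfer", list(transfer_frame))
poslati = VerbSense("poslati", "to send", "transfer", list(transfer_frame))

print(dati.pattern())
print(group_by_class([dati, poslati]))
```

Keeping the semantic class on the sense rather than on the lemma reflects the summary's statement that each verb sense, not each verb, is assigned to a class, so a polysemous verb can appear in several classes.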
Categorizing verbs into semantic classes and providing valency descriptions will reveal similarities and differences among verbs within a semantic class and between different classes, leading to a better understanding of the relationship between syntax and semantics. The structured and clear presentation of results in the database and publications will facilitate the inclusion of Croatian in comparative research and potential data exchange with similar databases.
The project is crucial for natural language processing because it provides a reliable and thorough description of syntactic patterns, semantic roles, and differentiated verb senses. The results can be used for parsing, automatic text annotation, semantic role labeling, improving machine translation tools, and creating language learning materials, among other applications. The outcomes will benefit professors of Croatian, students of all linguistic disciplines, Croatian and foreign linguists, and non-native speakers.
The database will continue to be updated even after the project’s completion, with the goal of becoming the most comprehensive database of verbs in the Croatian language.