Organizers:
- Pietro Dini, Università di Pisa
- Silvia Piccini, Istituto di Linguistica Computazionale "A. Zampolli", Pisa
- Adriano Cerri, Università di Pisa
Description:
Over the last two decades, numerous significant projects have been undertaken to preserve, document, and study the legacy of Old Baltic linguistic monuments. Some of them focus on text and resource repositories of a single linguistic tradition (cf. PKPDB for Old Prussian; SENIE for Old Latvian; SR, SLIEKKAS, ALQ and ALKT for Old Lithuanian), while others concentrate on specific authors (e.g. CorDon; SBCB) or specific textual genres (e.g. PosTiMe on Old Lithuanian Lutheran postils, OWNW on Old Baltic catechisms). However, the research landscape remains fragmented, characterized by disparate datasets and methodologies that lack integration. For instance, existing linguistic corpora and lexicons often employ divergent annotation practices and incompatible formats, which hinders data interoperability. Similarly, digital archives of ancient texts frequently adhere to project-specific schemas, limiting their accessibility for broader computational applications.
As a result, there is a growing need for a cohesive ecosystem that prioritizes the FAIRness of research data, metadata, and infrastructure, in line with the principles of open science. Achieving this requires a collaborative effort among philologists, linguists, and technologists to define unified standards, develop robust methodologies, and create frameworks that bridge the gap between traditional scholarship and cutting-edge technology.
This section aims to foster dialogue between tradition and innovation, focusing on the study of ancient Baltic texts through the application of new technologies, in the service of philological and linguistic research. In particular, contributions are invited on topics including, but not limited to:
- the design and implementation of ontologies and lexicographical resources based on the semantic web to improve access and interoperability of linguistic data;
- the digitization and analysis of ancient Baltic texts (manuscripts or printed) for the creation of digital archives;
- the annotation of corpora, both synchronic and diachronic, for linguistic and philological research;
- the application of computational tools for natural language processing (NLP) to Baltic languages, including lesser-studied varieties.
We also invite researchers to present current or future projects focused on the valorization of ancient Baltic linguistic monuments through the use of digital and computational methodologies. Colleagues working from this perspective will have the opportunity to share their experiences, results, and new ideas.
References
- ALKT = Kritische Edition altlitauischer Kleintexte vom Überlieferungsbeginn bis 1700.
- ALQ = Altlitauisches Quellenverzeichnis.
- CorDon = Digital Old Lithuanian: Corpus of Kristijonas Donelaitis (1714–1780).
- OWNW = Old Words for a New World: Translating Christianity to Baltic Pagans.
- PKPDB = Prūsų kalbos paveldo duomenų bazė.
- PosTiMe = Postil Time Machine.
- SBCB = Samuelio Boguslavo Chylinskio Naujasis Testamentas. Rankraščio tyrimas, faksimilinis ir interaktyvus skaitmeninis leidimas.
- SENIE = Latviešu valodas seno tekstu korpuss.
- SLIEKKAS = Technological and scientific basis for the linguistic annotation of Old Lithuanian Corpus.
- SR = Senieji raštai / Database of Old Writings.
Registration:
If you would like to submit a paper for this workshop, please fill out the registration form (which will be available in February 2025).