Pipelines for languages: not only Latin! The Italian NLP Tool (Tint)
In a recent contribution published here about the Classical Language Toolkit (https://openmethods.dariah.eu/2019/10/02/the-classical-language-toolkit-cltk-at-the-forefront-of-digital-philology-for-historical-languages/), Stanford CoreNLP (https://stanfordnlp.github.io/CoreNLP/index.html) was mentioned among the pipelines currently available for Natural Language Processing.
As detailed on its webpage, Stanford CoreNLP aims to be a complete Java-based set of tools for various aspects of language analysis, from annotation to dependency parsing, from lemmatization to coreference resolution. It thus provides a range of tools that can potentially be applied to languages other than English.
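To make the idea of a pipeline of analysis steps more concrete, here is a minimal sketch of how a set of annotators might be requested from a CoreNLP server over HTTP. It rests on several assumptions: a CoreNLP server has been started locally on its default port 9000, the annotator names are those documented for English pipelines, and the function names `pipeline_properties` and `build_corenlp_request` are hypothetical helpers written for this illustration. The request is only built, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Annotator names as documented for English CoreNLP pipelines
# (assumption: models for other languages may use different names).
DEFAULT_ANNOTATORS = ("tokenize", "ssplit", "pos", "lemma", "ner", "depparse")


def pipeline_properties(annotators=DEFAULT_ANNOTATORS):
    """Return the properties describing which analysis steps to run."""
    return {"annotators": ",".join(annotators), "outputFormat": "json"}


def build_corenlp_request(text, annotators=DEFAULT_ANNOTATORS,
                          base_url="http://localhost:9000/"):
    """Build (but do not send) a POST request for a CoreNLP server.

    base_url is an assumption: it points at a server started locally
    on the default port.
    """
    # The server reads the pipeline configuration from the "properties"
    # query parameter and the text to analyse from the request body.
    query = urlencode({"properties": json.dumps(pipeline_properties(annotators))})
    return Request(base_url + "?" + query,
                   data=text.encode("utf-8"),
                   method="POST")


if __name__ == "__main__":
    request = build_corenlp_request("Stanford University is in California.")
    print(request.get_method(), request.full_url)
```

With a server running, the request could be passed to `urllib.request.urlopen` and the JSON response inspected sentence by sentence. The order of the annotators matters, since later steps such as lemmatization and dependency parsing rely on the output of tokenization and part-of-speech tagging.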
Among the languages to which Stanford CoreNLP has been applied is Italian, for which the Tint pipeline has been developed, as described in the paper “Italy goes to Stanford: a collection of CoreNLP modules for Italian” by Alessio Palmero Aprosio and Giovanni Moretti (preprint arXiv:1609.06204).
The whole pipeline can be found and downloaded from the Tint webpage, http://tint.fbk.eu/. It comprises tokenization and sentence splitting, morphological analysis and lemmatization, part-of-speech tagging, named-entity recognition and dependency parsing, along with wrappers that are still under construction.
Tint thus aims to answer the need for NLP tools for Italian, a language for which, as Palmero Aprosio and Moretti highlight in the introduction of their paper, “there is a lack of this kind of resources”.
Resources consulted:
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.
Palmero Aprosio, Alessio, and Giovanni Moretti. 2016. Italy goes to Stanford: a collection of CoreNLP modules for Italian. eprint arXiv:1609.06204.
Stanford CoreNLP – Natural Language Software. https://stanfordnlp.github.io/CoreNLP/index.html
Tint (The Italian NLP Tool). http://tint.fbk.eu/