When history meets technology. impresso: an innovative corpus-oriented perspective.

OpenMethods introduction to: When history meets technology. impresso: an innovative corpus-oriented perspective. Blog post by Marinella Testori, published 2020-07-15. https://openmethods.dariah.eu/2020/07/15/when-history-meets-technology-impresso-an-innovative-corpus-oriented-perspective/

Historical newspapers, already available in many digitized collections, may represent a significant source of information for the reconstruction of events and backgrounds, enabling historians to cast new light on facts and phenomena, as well as to advance new interpretations.

Developed by EPFL-Lausanne, University of Zurich and C2DH Luxembourg, the ‘impresso – Media Monitoring of the Past’ project aims to offer an advanced corpus-oriented answer to the growing need to access and consult collections of digitized historical newspapers. The Europeana network has already brought the project to wider attention in the following interview with one of its developers, Dr Matteo Romanello: https://pro.europeana.eu/post/mining-and-exploring-200-years-of-newspapers-the-impresso-project

With a suite of computational tools for data extraction, linking and exploration, impresso aims to move beyond the traditional keyword-based approach by applying advanced techniques, from lexical processing to semantically enriched n-grams, and from data modelling to interoperability. In this way, the gap between raw lexical data and the corresponding semantic level can be bridged, yielding a genuine historical knowledge base (KB).
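To make the notion of n-grams concrete, here is a minimal Python sketch that counts word n-grams across a small set of texts. It is only an illustration of the general technique, not impresso's actual pipeline: the function names (`tokenize`, `ngrams`, `ngram_counts`) and the two sample sentences are hypothetical, and real historical newspaper text would also need OCR-aware tokenization and language-specific processing.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(articles, n):
    """Count n-grams over a collection of article texts."""
    counts = Counter()
    for text in articles:
        counts.update(ngrams(tokenize(text), n))
    return counts


# Hypothetical sample sentences, standing in for digitized articles.
articles = [
    "The federal council met in Bern.",
    "The Federal Council approved the new railway line.",
]

bigram_counts = ngram_counts(articles, 2)
print(bigram_counts.most_common(3))
```

A keyword search for "council" would match both sentences but say nothing about the words around it. Counting bigrams instead surfaces the recurring phrase ("federal", "council"), which is the kind of lexical unit that later steps can enrich semantically.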

As Romanello highlights in the above-mentioned interview, impresso provides a feature-rich interface, developed through a co-design approach, for visualizing and consulting a corpus of historical newspapers from Switzerland, Luxembourg and other countries (76 newspapers available as of January 2020).

Beyond implementing these techniques, impresso aims to contribute to the epistemological debate on how different types of sources for historical research relate to one another, and to address the main challenges that digital scholarship poses for history.

References:

impresso website: https://impresso-project.ch/

impresso interface: https://impresso-project.ch/app/

Daley, Beth. 2019. “Mining and exploring 200 years of newspapers: the impresso project” (https://pro.europeana.eu/post/mining-and-exploring-200-years-of-newspapers-the-impresso-project).

Internet Archive link: https://web.archive.org/web/20200624092937/https://pro.europeana.eu/post/mining-and-exploring-200-years-of-newspapers-the-impresso-project
