When history meets technology. impresso: an innovative corpus-oriented perspective.

When history meets technology. impresso: an innovative corpus-oriented perspective.

Historical newspapers, already available in many digitized collections, may represent a significant source of information for the reconstruction of events and backgrounds, enabling historians to cast new light on facts and phenomena, as well as to advance new interpretations. Lausanne, University of Zurich and C2DH Luxembourg, the ‘impresso – Media Monitoring of the Past’ project wishes to offer an advanced corpus-oriented answer to the increasing need of accessing and consulting collections of historical digitized newspapers.
[…] Thanks to a suite of computational tools for data extraction, linking and exploration, impresso aims at overcoming the traditional keyword-based approach by means of the application of advanced techniques, from lexical processing to semantically deepened n-grams, from data modelling to interoperability.
[Click ‘Read more’ for the full post!]

Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the USA, 1898–1920

Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the USA, 1898–1920

Introduction: The article illustrates the application of a ‘discourse-driven topic modeling’ (DDTM) to the analysis of the corpus ChronicItaly comprising several newspapers in Italian language, appeared in the USA during the time of massive migration towards America between the end of the XIX century and the first two decades of the XX (1898-1920).

The method combines both Text Modelling (™) and the discourse-historical approach (DHA) in order to get a more comprehensive representation of the ethnocultural and linguistic identity of the Italian group of migrants in the historical American context in crucial periods of time like that immediately preceding the eruption and that of the unfolding of World War I.