Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the USA, 1898–1920

OpenMethods introduction to: Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the USA, 1898–1920
https://openmethods.dariah.eu/2020/02/23/mining-ethnicity-discourse-driven-topic-modelling-of-immigrant-discourses-in-the-usa-1898-1920/
2020-02-23 11:36:00
https://academic.oup.com/dsh/advance-article/doi/10.1093/llc/fqz068/5601610
Blog post

Introduction by OpenMethods Editor (Marinella Testori):

The article illustrates the application of ‘discourse-driven topic modelling’ (DDTM) to ChroniclItaly, a corpus of Italian-language newspapers published in the USA during the period of mass migration to America between the end of the nineteenth century and the first two decades of the twentieth (1898–1920).

The method combines topic modelling (TM) with the discourse-historical approach (DHA) to obtain a more comprehensive picture of the ethnocultural and linguistic identity of the Italian migrant community in its historical American context, at crucial moments such as the years immediately preceding the outbreak of World War I and the years of the war itself.

The method made it possible to identify a series of topics, together with their top words, that chiefly characterize the discourses and conversations of the Italian migrant community spread between California and West Virginia, to mention two of the five US states where the newspapers in the corpus were published (the other three are Massachusetts, Pennsylvania and Vermont).
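
For readers unfamiliar with what ‘topics and related top words’ look like in practice, here is a minimal sketch. It assumes an LDA-style topic model fitted by collapsed Gibbs sampling, which is an assumption made for illustration and not a description of the article's actual pipeline. The function name `lda_top_words`, its parameters, and the four toy documents are all hypothetical; the documents are invented and are not drawn from ChroniclItaly.

```python
import random
from collections import Counter


def lda_top_words(docs, n_topics=2, n_top=3, iters=200,
                  alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (hypothetical helper).

    docs is a list of token lists; returns the top words of each topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables updated during sampling.
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                       # all tokens in topic k
    assign = []                                        # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # The most frequent words assigned to each topic are its "top words".
    return [
        [w for w, c in topic_word[t].most_common(n_top) if c > 0]
        for t in range(n_topics)
    ]


# Invented toy documents (not taken from the corpus).
toy_docs = [
    "guerra italia soldati fronte guerra".split(),
    "lavoro miniera sciopero operai lavoro".split(),
    "italia guerra fronte soldati".split(),
    "operai sciopero lavoro miniera".split(),
]
topics = lda_top_words(toy_docs)
for number, words in enumerate(topics):
    print(f"Topic {number}: {', '.join(words)}")
```

The word lists printed at the end are the kind of output that a discourse-driven approach would then interpret in historical context; a sampler like this only supplies the statistical groupings, not the reading of them.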

Moreover, the method aims to help address two current challenges in DH: developing a better understanding of the relation between what is ‘digital’ and what pertains to the ‘humanities’, and overcoming the traditional distinction between ‘close’ and ‘distant’ reading techniques in favor of a blended approach.

References:

Viola L. (2018). ChroniclItaly: A Corpus of Italian American Newspapers from 1898 to 1920. Utrecht: Utrecht University. https://doi.org/10.24416/UU01-T4YMOW

Source: Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the USA, 1898–1920 | Digital Scholarship in the Humanities | Oxford Academic