https://openmethods.dariah.eu/2020/02/23/mining-ethnicity-discourse-driven-topic-modelling-of-immigrant-discourses-in-the-usa-1898-1920/
OpenMethods introduction to: Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the USA, 1898–1920
2020-02-23 11:36:00
Marinella Testori
https://academic.oup.com/dsh/advance-article/doi/10.1093/llc/fqz068/5601610
Blog post
Contextualizing
Discovering
English
Italian
Multimodal
Text
Topic Modeling
Association for Computational Linguistics
CDA
Chronicling America
Creative Commons
critical discourse analysis
Cronaca sovversiva
Dante Alighieri
Destination America
digital humanities
Digital Scholarship in the Humanities
Discourse & Society
discourse analysis
Ellis Island
Gantt chart
Guglielmo Marconi
Harvard University Press
historical records
il giornale
interactional sociolinguistics
Irish American
Italian American
Latent Dirichlet Allocation
Library of Congress
Luigi Galleani
machine learning
Mass Media
Northern Italians
open access
Oscar Handlin
Print Culture
Statue of Liberty
Text Mining
World War I
Introduction by OpenMethods Editor (Marinella Testori):
The article illustrates the application of ‘discourse-driven topic modelling’ (DDTM) to the analysis of ChroniclItaly, a corpus of Italian-language newspapers published in the USA during the period of mass migration to America, between the end of the nineteenth century and the first two decades of the twentieth (1898-1920).
The method combines topic modelling (TM) with the discourse-historical approach (DHA) to obtain a more comprehensive picture of the ethnocultural and linguistic identity of Italian migrants in their historical American context, at crucial moments such as the years immediately preceding the outbreak of World War I and the war itself.
Thanks to the method, it has been possible to highlight a series of topics, with their top words, that chiefly characterize the discourses and conversations of the Italian migrant community spread between California and West Virginia, to mention two of the five US states where the newspapers in the corpus were published (the other three are Massachusetts, Pennsylvania and Vermont).
Moreover, the method aims to contribute to resolving two current challenges in DH: developing a better understanding of the relation between what is ‘digital’ and what pertains to the ‘humanities’, and overcoming the traditional distinction between ‘close’ and ‘distant’ reading in favor of a blended approach.
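For readers unfamiliar with how a topic model yields "topics and top words", the following is a minimal sketch of the topic-modelling half of the method only. It assumes Latent Dirichlet Allocation fitted with a toy collapsed Gibbs sampler; the article's actual tooling and parameters are not described in this post, and the discourse-historical reading of the results is interpretive work that code does not capture. The function names (`lda_gibbs`, `top_words`), the hyperparameters and the tiny token lists are all invented for illustration and are not drawn from ChroniclItaly.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not the article's code).

    docs is a list of token lists; returns one Counter of word counts per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics                        # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word

def top_words(topic_word, n=3):
    """Return the n most frequent words currently assigned to each topic."""
    return [
        [w for w, c in counts.most_common(n) if c > 0]
        for counts in topic_word
    ]

# Invented toy "documents" (already tokenized); NOT taken from ChroniclItaly.
toy_docs = [
    ["guerra", "fronte", "soldati", "guerra", "patria"],
    ["lavoro", "sciopero", "operai", "lavoro", "salario"],
    ["guerra", "soldati", "patria", "fronte"],
    ["operai", "salario", "sciopero", "lavoro"],
]

topic_counts = lda_gibbs(toy_docs, n_topics=2)
topics = top_words(topic_counts)
for k, words in enumerate(topics):
    print(f"topic {k}: {', '.join(words)}")
```

The top-word lists are only the starting point: in a discourse-driven approach they are read back against the newspaper texts and their historical context, which is where the ‘close’ and ‘distant’ readings meet.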
References:
Viola, L. (2018). ChroniclItaly: A Corpus of Italian American Newspapers from 1898 to 1920. Utrecht: Utrecht University. https://doi.org/10.24416/UU01-T4YMOW
Source: Mining ethnicity: Discourse-driven topic modelling of immigrant discourses in the USA, 1898–1920 | Digital Scholarship in the Humanities | Oxford Academic