Annotating

GitHub – CateAgostini/IIIF

Posted on February 10, 2022 (updated March 2, 2022)
by Erzsebet Tóth-Czifra

Introduction: In this resource, Caterina Agostini, PhD in Italian from Rutgers University and Project Manager at The Center for Digital Humanities at Princeton, shares two handouts from workshops she organized and co-taught on the International Image Interoperability Framework (IIIF). They provide a gentle introduction to IIIF and a clear overview of its features (displaying, editing, annotating, sharing and comparing images according to universal standards), along with examples and resources. The handouts will be useful to anyone designing or teaching Open Educational Resources on IIIF.
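To make the "universal standards" point concrete, here is a minimal sketch of how an IIIF Image API request URL is assembled from its parts (identifier, region, size, rotation, quality, format). The helper name `iiif_image_url` and the `example.org` server are hypothetical and are not taken from the handouts; the default size `max` follows Image API 3.0, whereas 2.x servers expect `full`.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API URL of the form
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}.

    `base` is the server plus prefix; the identifier is percent-encoded
    so that spaces and slashes inside it do not break the path.
    """
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, for illustration only.
full_image = iiif_image_url("https://example.org/iiif", "page 1")
detail = iiif_image_url("https://example.org/iiif", "page 1",
                        region="0,0,500,500", size="250,")
print(full_image)
print(detail)
```

Because every compliant server understands the same URL pattern, the same request works across collections, which is what allows viewers to compare images held by different institutions.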

Languages

Topic-specific corpus building: A step towards a representative newspaper corpus on the topic of return migration using text mining methods – Journal of Digital History

Posted on January 27, 2022
by Marinella Testori and Erzsébet Tóth-Czifra

Introduction: In this post, we highlight a new publication venue for digital historians, the Journal of Digital History, where digital scholarship is presented in three layers: a narrative layer, an epistemic layer and a data layer. These publications are therefore complex digital scholarly outputs that open a wider window on DH research and enable readers to follow the whole research process, and to execute or even reproduce certain steps. We showcase this innovative publication method by highlighting a methodology paper from the first issue, Sarah Oberbichler and Eva Pfanzelter’s Topic-specific corpus building: A step towards a representative newspaper corpus on the topic of return migration using text mining methods.


Archiving

Find research data repositories for the humanities – the data deposit recommendation service

Posted on November 22, 2021 (updated February 27, 2024)
by Erzsebet Tóth-Czifra

Introduction: Finding research data repositories that match the technical or legal requirements of your research data is not always an easy task. This paper, authored by Stephan Buddenbohm, Maaike de Jong, Jean-Luc Minel and Yoann Moranville, showcases the demonstrator instance of the Data Deposit Recommendation Service (DDRS), an application built on top of the re3data database specifically for scholars working in the Humanities. The paper also outlines further directions for developing the tool, many of which implicitly bring sustainability issues to the table.

Code

BERT for Humanists: a deep learning language model meets DH

Posted on November 9, 2021 (updated November 10, 2021)
by Marinella Testori

Introduction: Awarded Best Long Paper at the 2019 NAACL (North American Chapter of the Association for Computational Linguistics) conference, the contribution by Jacob Devlin et al. introduces “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding” (https://aclanthology.org/N19-1423/).

As the authors highlight in the abstract, BERT is a “new language representation model” and, in the past few years, it has become widespread in various NLP applications; one project that builds on it is CamemBERT (https://camembert-model.fr/), a language model for French.

In June 2021, a workshop organized by David Mimno, Melanie Walsh and Maria Antoniak (https://melaniewalsh.github.io/BERT-for-Humanists/workshop/) showed how to use BERT in digital humanities projects for tasks such as word similarity and classification, relying on the Python-based HuggingFace transformers library (https://melaniewalsh.github.io/BERT-for-Humanists/tutorials/). A further advantage of this training resource is that it has been written with its target audience in mind: it provides a gentle introduction to the complexities of language models for scholars whose education and background lie outside Computer Science.
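The word-similarity task mentioned above usually comes down to comparing vectors with cosine similarity. The sketch below illustrates only that comparison step in plain Python; the three-dimensional vectors and their labels are invented toy values, not output from BERT or from the workshop tutorials, where the vectors would come from the model itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target, vectors):
    """Rank every other entry in `vectors` by similarity to `target`, highest first."""
    scored = [
        (word, cosine_similarity(vectors[target], vec))
        for word, vec in vectors.items()
        if word != target
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented toy vectors standing in for contextual embeddings of "bank"
# in two different sentences (assumption, for illustration only).
toy_vectors = {
    "river": [0.9, 0.1, 0.0],
    "bank_water": [0.8, 0.2, 0.1],
    "bank_money": [0.1, 0.9, 0.2],
}

for word, score in most_similar("river", toy_vectors):
    print(f"{word}: {score:.3f}")
```

The point of a contextual model is that the same word receives different vectors in different sentences, so a ranking like this one can separate senses that a single static vector per word would merge.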

Along with the Tutorials, the same site includes introductions to BERT in general and to its use in a Google Colab notebook, as well as a constantly updated bibliography and a glossary of the main terms (‘attention’, ‘Fine-Tune’, ‘GPU’, ‘Label’, ‘Task’, ‘Transformers’, ‘Token’, ‘Type’, ‘Vector’).

Languages

Voluntad y deseo en la filosofía moderna: un acercamiento computacional

Posted on October 15, 2021
by Sara Chamosa Rabadan

Introduction: In this study, Cebral Loureda analyzes how will and desire are conveyed in Ethics, by Spinoza; The Phenomenology of Spirit, by Hegel; The World as Will and Representation, by Schopenhauer; and Thus Spoke Zarathustra, by Nietzsche. To determine these texts’ degree of cohesion, the author follows a computational and quantitative methodology to compare and contrast them, as well as to assess their internal contradictions. A normalized corpus, statistics and visualizations are employed to evaluate the terminology, topoi and sentiment of these works. Regarding terminology, the author’s findings reveal that Nietzsche uses a vocabulary highly differentiated from that of the other philosophers, adding marked emotional connotations to his discourse. Visualizations show the terminological commonalities between Hegel and Schopenhauer and indicate that Hegel has the highest number of semantic connections with the other philosophers. As for topoi, the results show a clear dichotomous tension between conceptual and vital experience in the studied documents. Redefining this dualism, however, Cebral Loureda observes that the concrete is always intertwined with the abstract and vice versa. Regarding the sentimental dimension of these works, the analysis reveals that Nietzsche’s text carries the greatest negative sentiment load, whereas Spinoza’s is the most emotionally balanced. With all this, Cebral Loureda argues that there is a high degree of cohesion among these philosophical works, which link reason and emotion to will, time and spirit, core notions of modern philosophy and society.

Annotating

The First of May in German Literature

Posted on September 26, 2021 (updated September 28, 2021)
by Erzsebet Tóth-Czifra

Introduction by OpenMethods Editor (Erzsébet Tóth-Czifra): Research on extracting dates from literature brings us closer to answering the big question of “when literature takes place”. As Frank Fischer’s blog post, First of May in German Literature, shows, beyond mere quantification, this line of research also yields insights into the cultural significance of certain dates. In this case, the significance of the 1st of May in German literature (as reflected in the “Corpus of German-Language Fiction” dataset) was determined with the help of a freely accessible dataset and the open access tool HeidelTime. The brief description of the workflow is a smart demonstration of the potential of open DH methods and of sharing data in sustainable ways.

Bonus one: the post opens by briefly touching upon some of Frank’s public humanities activities.

Bonus two: the post mentions the Tiwoli (“Today in World Literature”) app, a fun side product built on top of the date extraction research.
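The counting idea behind this kind of date extraction can be sketched in a few lines. What follows is a deliberately simplified stand-in, not HeidelTime and not the workflow from the blog post: it only catches explicit "day. month" mentions such as "1. Mai" with a regular expression, and the sample sentences are made up for illustration.

```python
import re
from collections import Counter

# German month names mapped to month numbers.
MONTHS = {
    "Januar": 1, "Februar": 2, "März": 3, "April": 4, "Mai": 5, "Juni": 6,
    "Juli": 7, "August": 8, "September": 9, "Oktober": 10,
    "November": 11, "Dezember": 12,
}

# Matches patterns like "1. Mai" or "24. Dezember" (one- or two-digit day, then a month name).
DATE_RE = re.compile(r"\b(\d{1,2})\.\s*(" + "|".join(MONTHS) + r")\b")


def extract_dates(text):
    """Return (day, month) tuples for explicit dates, skipping impossible day numbers."""
    dates = []
    for day, month in DATE_RE.findall(text):
        if 1 <= int(day) <= 31:
            dates.append((int(day), MONTHS[month]))
    return dates


def count_dates(texts):
    """Count how often each (day, month) pair occurs across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(extract_dates(text))
    return counts


# Made-up sample sentences (assumption, not taken from the corpus).
sample_texts = [
    "Am 1. Mai zogen sie hinaus vor die Stadt.",
    "Erst am 24. Dezember kehrte er heim; am 1. Mai war er aufgebrochen.",
]
print(count_dates(sample_texts).most_common())
```

A dedicated temporal tagger goes much further than this, for example by resolving spelled-out or relative expressions, which is why the research relied on one rather than on pattern matching alone.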

Artifacts

TAO IC Project: the charm of Chinese ceramics.

Posted on August 31, 2021
by Marinella Testori

Introduction: Among the nominees in the ‘Best DH Dataset’ category of the DH Awards 2020, the TAO IC Project (http://www.dh.ketrc.com/index.html) takes us on a fascinating journey through the world of Chinese ceramics. The project, which is developed collaboratively at the Knowledge Engineering & Terminology Research Center of Liaocheng (http://ketrc.com/), uses an onto-terminology-based approach to build an e-dictionary of Chinese vessels. Do you want to know every detail about a ‘Double-gourd Vase I’? If you consult ‘Class’ in the ‘Ontology’ section (http://www.dh.ketrc.com/class.html), you can discover its components, its function, what such a vessel is made of, and how it is fired. If you also wish to see what the vase looks like, under ‘Individuals’ in the same section you can read a full description of it and see a picture (http://www.dh.ketrc.com/class.html). All this information is collected in the e-dictionary for each beautiful item belonging to the Ming and Qing dynasties.


Analysis

What Counts as Culture? Part I: Sentiment Analysis of The Times Music Reviews, 1950-2009 – train in the distance

Posted on July 8, 2021
by Erzsebet Tóth-Czifra

Introduction: This blog post by Lucy Havens presents a sentiment analysis of over 2000 Times Music Reviews using freely available tools: defoe for building the corpus of reviews, VADER for sentiment analysis, and Jupyter Notebooks to provide rich documentation and to connect the different components of the analysis. The description of the workflow comes with critical reflections on the tools and methods, including an outlook on how to improve and extend the analysis.
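For readers new to lexicon-based sentiment analysis, the following toy sketch shows the basic mechanism: look up each word in a list of scored terms and add the scores, flipping the sign after a negation. It is not VADER and does not reproduce the blog post's workflow; the lexicon, the scores and the negation rule are all invented for illustration, and VADER's own lexicon and rules are considerably richer.

```python
import re

# Invented mini-lexicon: positive scores for praise, negative for criticism.
TOY_LEXICON = {
    "brilliant": 3.0, "superb": 3.0, "good": 2.0,
    "dull": -2.0, "poor": -2.0, "dreadful": -3.0,
}
NEGATIONS = {"not", "never", "no"}


def score_review(text):
    """Sum the lexicon scores of the words in `text`.

    A word directly preceded by a negation has its score flipped,
    so "not good" counts as negative.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, token in enumerate(tokens):
        if token in TOY_LEXICON:
            value = TOY_LEXICON[token]
            if i > 0 and tokens[i - 1] in NEGATIONS:
                value = -value
            total += value
    return total


# Made-up review snippets (assumption, not from The Times corpus).
for snippet in ["A brilliant performance", "Not good, rather dull", "The orchestra played"]:
    print(snippet, "->", score_review(snippet))
```

Seeing how little the mechanism knows about context (irony, period vocabulary, genre conventions) helps explain why tool and method criticism matters when such scores are applied to historical reviews.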

Analysis

Visualizando libros difundidos y censurados durante la Guerra Fría: 1956-1971. El caso Alfred Reisch

Posted on June 28, 2021
by Paul Spence

Introduction: This article explores the potential use of data-driven methods to visualise and interpret the impact of Western efforts to influence Cold War dynamics through a covert book distribution programme. Based on a documentary corpus connected to the 2013 book by Alfred Reisch, which documented efforts by the CIA to disseminate books in the Soviet Bloc in the period 1956-1971, the authors use the Tableau Public platform to re-assess information science methods for researching historical events. Their analysis suggests that the books distributed did not tend to have an obvious political slant, but were more likely to have a broader universalist outlook. While the article skirts around some of the limitations of visualization (highlighted elsewhere by Drucker and others), it offers a general audience a solid introduction to the benefits of a data-driven approach.

Collaboration

OpenMethods Spotlights #3 Keeping a smart diary of research processes with NeMO and the Scholarly Ontology

Posted on June 22, 2021
by Erzsebet Tóth-Czifra

In the next episode, we look behind the scenes of two ontologies, NeMO and the Scholarly Ontology (SO), with Panos Constantopoulos and Vayianos Pertsas, who tell us the story behind these ontologies and explain how they can be used to ease or upcycle your daily work as a researcher. We discuss the value of knowledge graphs, how NeMO and SO connect with the emerging DH ontology landscape and beyond, why Open Access is a precondition for populating them, the Greek DH landscape … and much more!