BERT for Humanists: a deep learning language model meets DH

Introduction: The paper “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding” by Jacob Devlin et al. (https://aclanthology.org/N19-1423/) received the Best Long Paper award at the 2019 NAACL (North American Chapter of the Association for Computational Linguistics) conference.

As the authors highlight in the abstract, BERT is a “new language representation model”. In the past few years it has become widespread in NLP applications; one project building on it is CamemBERT (https://camembert-model.fr/), a model for French.

In June 2021, a workshop organized by David Mimno, Melanie Walsh and Maria Antoniak (https://melaniewalsh.github.io/BERT-for-Humanists/workshop/) showed how to use BERT in digital humanities projects for tasks such as word similarity and classification, relying on the Python-based HuggingFace transformers library (https://melaniewalsh.github.io/BERT-for-Humanists/tutorials/). A further advantage of this training resource is that it was written with its target audience in mind: it offers a gentle introduction to the complexities of language models for scholars whose education and background lie outside Computer Science.

Along with the Tutorials, the same blog includes Introductions to BERT in general and to its specific use in a Google Colab notebook, as well as a constantly updated bibliography and a glossary of the main terms (‘attention’, ‘Fine-Tune’, ‘GPU’, ‘Label’, ‘Task’, ‘Transformers’, ‘Token’, ‘Type’, ‘Vector’).
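
For readers new to the glossary term ‘Vector’, the sketch below illustrates the general idea behind comparing words by their vectors. It is not taken from the workshop materials: the three-dimensional vectors are made up for illustration (real BERT vectors are much longer and come from the model), and the function names are hypothetical. It only shows cosine similarity, one common way to compare two vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors, NOT output from BERT; they stand in for the
# vectors a language model would produce for each word.
toy_vectors = {
    "river": [0.9, 0.1, 0.3],
    "stream": [0.8, 0.2, 0.4],
    "ledger": [0.1, 0.9, 0.2],
}


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word` by cosine similarity."""
    scores = {
        other: cosine_similarity(vectors[word], vec)
        for other, vec in vectors.items()
        if other != word
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(most_similar("river", toy_vectors))
```

In an actual project the toy dictionary would be replaced by vectors extracted from a model, as the tutorials describe.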

DH Research Software Engineers – For We Are Many

Introduction: This white paper is an outcome of a DH2019 workshop dedicated to fostering closer collaboration between technology-oriented DH researchers and developers of tools that support Digital Humanities research. The paper briefly outlines the most pressing issues in their collaboration and addresses topics such as good practices to ease mutual understanding between scholars and researchers; software development and academic career and recognition; and sustainability and funding.

Little package, big dependency

Introduction: The world of R consists of innumerable packages. Most of them have very low download rates because they are limited to specific functions that form part of a larger task. Based on a surprising experience with the small package clipr, Matthew Lincoln shares his thoughts about this reception phenomenon, especially in the digital humanities.

The Research Software Directory and how it promotes software citation

Introduction: The Research Software Directory of the Netherlands eScience Institute provides easy access to software, its source code and its documentation. More importantly, it makes it easy to cite software, which is highly advisable when using software to derive research results. The Research Software Directory positions itself as a platform that eases scientific referencing and the reproducibility of software-based research, a good practice among peers that is still underdeveloped in the humanities.

From Hermeneutics to Data to Networks: Data Extraction and Network Visualization of Historical Sources

Introduction: This lesson by Marten Düring on the Programming Historian website gently introduces novices to network visualisation of historical sources. Working through a case study, it covers not only the general advantages of network visualisation for humanists but also a step-by-step explanation of the process, from data extraction to visualisation (using the Palladio tool). The lesson has also been translated into Spanish and includes many useful references for further reading.
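
To make the step from source to data more concrete, here is a minimal sketch of what an extracted network can look like as a table. It is an illustration only and is not taken from the lesson: the relations, the column names and the helper functions are all hypothetical, and the sketch assumes that the visualisation tool accepts comma-separated tabular data.

```python
import csv
import io

# Hypothetical relations read off a source: (who, to whom, kind of tie).
# The names are placeholders, not data from the lesson.
relations = [
    ("Person A", "Person B", "provides shelter"),
    ("Person A", "Person C", "forwards a letter"),
    ("Person C", "Person B", "lends money"),
]


def to_edge_list_csv(rows):
    """Serialise the relations as a CSV edge list with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Relation"])  # assumed column names
    writer.writerows(rows)
    return buffer.getvalue()


def degree(rows):
    """Count how many ties each person is involved in (giving or receiving)."""
    counts = {}
    for source, target, _relation in rows:
        counts[source] = counts.get(source, 0) + 1
        counts[target] = counts.get(target, 0) + 1
    return counts


if __name__ == "__main__":
    print(to_edge_list_csv(relations))
    print(degree(relations))
```

Each row is one edge of the network; the interpretive work of deciding what counts as a tie happens before this step, when the source is read.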

Creating Web APIs with Python and Flask | Programming Historian

Introduction: This comprehensive tutorial by Patrick Smyth shows digital humanists, and anyone interested in applying digital technologies to projects, how to make data more accessible to users through APIs (Application Programming Interfaces). After explaining the basics of APIs and databases, it builds an API and puts it into practice, using Python 3 and the Flask web framework.
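
As a rough idea of what such an API looks like, the sketch below defines a tiny Flask application with two endpoints. It is not the tutorial's code: the records, the route paths and the function names are hypothetical, and the data is kept in a Python list rather than a database to keep the example short. It assumes Flask is installed.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical in-memory records standing in for a real database.
RECORDS = [
    {"id": 0, "title": "Example Title One", "year": 1901},
    {"id": 1, "title": "Example Title Two", "year": 1925},
]


@app.route("/api/records/all", methods=["GET"])
def all_records():
    """Return every record as JSON."""
    return jsonify(RECORDS)


@app.route("/api/records", methods=["GET"])
def records_by_id():
    """Return the records matching the ?id= query parameter."""
    record_id = request.args.get("id", type=int)
    if record_id is None:
        # No usable id was supplied, so report a client error.
        return jsonify({"error": "Please provide an integer id."}), 400
    matches = [record for record in RECORDS if record["id"] == record_id]
    return jsonify(matches)


if __name__ == "__main__":
    # Starts a local development server; not meant for production use.
    app.run(debug=True)
```

With the server running locally, a request to /api/records?id=1 would return the second record as JSON, which is the kind of machine-readable access the tutorial is about.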

Teaching Quantitative Methods: What Makes It Hard (in Literary Studies)

Introduction: This article reflects on the lessons the author learnt when he first taught a graduate course in the digital analysis of literary texts. He stresses the importance of methodologies over technologies, the need for well-curated, community-created teaching datasets, and the implications of the practical, discipline-based organisation of curricula.