Analysis

Do humanists need BERT?

Posted on August 12, 2019 (updated August 13, 2019)
by Christopher Nunn

Introduction: Ted Underwood tests a new language representation model called “Bidirectional Encoder Representations from Transformers” (BERT) and asks whether humanists should use it. Given its high degree of difficulty and its limited success (e.g. in questions of genre detection), he concludes that this approach will be important in the future but is not something humanists need to deal with at the moment. An important caveat worth reading.

Analysis

Attributing Authorship in the Noisy Digitized Correspondence of Jacob and Wilhelm Grimm | Digital Humanities

Posted on June 11, 2018 (updated July 22, 2018)
by Joris van Zundert

Introduction: Apart from its buoyant conclusion that authorship attribution methods are rather robust to the noise (transcription errors) introduced by optical character recognition and handwritten text recognition, this article also offers a comprehensive read on applying sophisticated computational techniques for testing and validation in a data curation process.

Analysis

Towards Semantic Enrichment of Newspapers: A Historical Ecology Use Case

Posted on December 12, 2017
by Joris van Zundert

Introduction: Ecologists are much aided by historical sources of information on human-animal interaction. But how does one cope with the plethora of different descriptions for the same animal in the historical record? A Dutch research group reports on how to aggregate ‘Bunzings’, ‘Ullingen’, and ‘Eierdieven’ (‘Egg-thieves’) into a useful historical ecology knowledge base.

Analysis

Stylometry with R: A Package for Computational Text Analysis

Posted on October 4, 2017 (updated November 9, 2017)
by Maciej Maryl

Introduction: This software paper describes ‘stylo’ – an R package for stylometric research and text processing.

Analysis

A Genealogy of Distant Reading

Posted on October 4, 2017 (updated November 9, 2017)
by Maciej Maryl

Introduction: This article traces the complex genealogy of distant reading back to social-scientific approaches in literary studies.

Analysis

Un nouveau corpus, un peu d’XSLT et la transdisciplinarité

Posted on September 15, 2017 (updated November 9, 2017)
by Delphine Montoliu

Introduction: This post analyses text/image sequence alignment and the quality of manuscript transcriptions.

Analysis

Project RetroDig is launched at Heidelberg University

Posted on August 31, 2017 (updated November 9, 2017)
by Delphine Montoliu

Introduction: This post outlines the retro-digitisation and scholarly analysis of paper-based documents.

Analysis

TRACER

Posted on August 31, 2017 (updated November 9, 2017)
by Delphine Montoliu

Introduction: This post presents a tool that detects reused (key)words, sentences, etc. in texts in various languages.

Analysis

Music encoding and multimodality: score, text, image and performance

Posted on August 31, 2017 (updated November 9, 2017)
by Delphine Montoliu

Introduction: This is a conference report on musicology and encoding.

Analysis

L’archivage des données numériques: retours d’expérience sur des données orales

Posted on August 25, 2017 (updated November 9, 2017)
by Delphine Montoliu

Introduction: This post outlines a conference on an experiment in archiving oral data.