“Crossing borders: Three talks on Text Analysis and Digital Humanities” 1/3

Introduction by OpenMethods Editor (Delphine Montoliu): The podcast of Melissa Terras’s talk, given in English at the École Normale Supérieure (ENS) in Paris, France, is now available.

Talk by Melissa Terras: “Linking Crowdsourced Transcription to Automated Handwriting Recognition: Lessons from Transcribe Bentham”


For nearly seven years, the Transcribe Bentham project has been generating high-quality crowdsourced transcripts of the writings of the philosopher and jurist Jeremy Bentham (1748-1832), held at University College London and, latterly, the British Library. Little did we know at the outset that this project, now with nearly 6 million words transcribed by volunteers, would provide an ideal, quality-controlled dataset to serve as “ground truth” for the development of Handwritten Text Recognition technology. This paper will look at the past, present and future of automated handwriting analysis for documents, showing how our research on the EU Framework 7 tranScriptorium project, and now the H2020 READ project, is working towards a service to improve the searching and analysis of digitised manuscript collections across Europe, reusing the data created by crowdsourced volunteer labour for machine learning purposes.


Original publication date: 23/06/2017.

Source: “Crossing borders: Three talks on Text Analysis and Digital Humanities”, organised by the LATTICE laboratory.