GROBID: when data extraction becomes a suite

GROBID: when data extraction becomes a suite

Introduction: GROBID is an already well-known open source tool in the field of Digital Humanities, originally built to extract and parse bibliographical metadata from scholarly works. The acronym stands for GeneRation Of BIbliographic Data.
Shaped by use cases and adoptions to a range of different DH and non-DH settings, the tool has been progressively evolved into a suite of technical features currently applied to various fields, like that of journals, dictionaries and archives.
[Click ‘Read more’ for the full post!]

Document ALL the things!| The Center for Digital Humanities at Princeton

Document ALL the things!| The Center for Digital Humanities at Princeton

Introduction: Sustainability questions such as how to maintain digital project outputs after the funding period, or how to keep aging code and infrastructure that are important for our research up-to-date are among the major challenges DH projects are facing today. This post gives us a sneak peek into the solutions and working practices from the Center for Digital Humanities at Princeton. In their approach to build capacity for sustaining DH projects and preserve access to data and software, they view projects as collaborative and process-based scholarship. Therefore, their focus is on implementing project management workflows and documentation tools that can be flexibly applied to projects of different scopes and sizes and also allow for further refinement in due case. By sharing these resources together with their real-life use cases in DH projects, their aim is to benefit other scholarly communities and sustain a broader conversation about these tricky issues.

Transkribus & Magazines: Transkribus’ Transcription & Recognition Platform (TRP) as Social Machine…

Transkribus & Magazines: Transkribus’ Transcription & Recognition Platform (TRP) as Social Machine…

Introduction: This article proposes establishing a good collaboration between FactMiners and the Transkribus project that will help the Transkribus team to evolve the “sustainable virtuous” ecosystem they described as a Transcription & Recognition Platform — a Social Machine for Job Creation & Skill Development in the 21st Century!

Teaching Quantitative Methods: What Makes It Hard (in Literary Studies)

Introduction: This article reflects on the lessons learnt by the author as he first taught a graduate course in digital analysis of literary texts. He stresses the importance of methodologies over technologies, the need for well-curated, community-created teaching datasets and the implications of the practical, discipline-based organisation of the curricula.