Introduction: With Web archives becoming an increasingly more important resource for (humanities) researchers, it also becomes paramount to investigate and understand the ways in which such archives are being built and how to make the processes involved transparent. Emily Maemura, Nicholas Worby, Ian Milligan, and Christoph Becker report on the comparison of three use cases and suggest a framework to document Web archive provenance.
Introduction: Apart from its buoyant conclusion that authorship attribution methods are rather robust to noise (transcription errors) introduced by optical character recognition and handwritten text recognition, this article also offers a comprehensive read on the application of sophisticated computational techniques for testing and validation in a data curation process.
Introduction: The rperseus package provides classicists and other people interested in ancient philology and exegesis with corpora of texts from the ancient world (based on the Perseus Digital Library), combined with a toolkit designed to compare passages and selected words with parallels where the same expressions or words occur.
Introduction: Ecologists are much aided by historical sources of information on human-animal interaction. But how does one cope with the plethora of different descriptions for the same animal in the historic record? A Dutch research group reports on how to aggregate ‘Bunzings’, ‘Ullingen’, and ‘Eierdieven’ (‘Egg-thieves’) into a useful historical ecology knowledge base.
Introduction: Concepts are described differently in different times, and the way people talk about them reveals much about how people perceive these concepts. Researchers of the eScience Center in Amsterdam together with scholars from Utrecht University developed a visual tool to gain insight into such concept shift.
Introduction: The article discusses how letters are being used across the disciplines, identifying similarities and differences in transcription, digitisation and annotation practices. It is based on a workshop held after the end of the project Digitising experiences of migration: the development of interconnected letters collections (DEM). The aims were to examine issues and challenges surrounding digitisation, build capacity relating to correspondence mark-up, and initiate the process of interconnecting resources to encourage cross-disciplinary research. Subsequent to the DEM project, TEI templates were developed for capturing information within and about migrant correspondence, and visualisation tools were trialled with metadata from a sample of letter collections. Additionally, as a demonstration of how the project’s outputs could be repurposed and expanded, the correspondence metadata that was collected for DEM was added to a more general correspondence project, Visual Correspondence.
Introduction: This post explains the necessary lemmatization process for topic modelling on French or European texts with Mallet.
Introduction: The post discusses the challenges that traditional philological approach has to face in creating digital corpora of critical editions of nonvernacular medieval works.
Introduction: Now that sources for research increasingly are digital sources, how do we establish the quality of such sources?
Introduction: This article traces complex genealogy of distant reading to social-scientific approaches in literary studies.