Introduction: With Web archives becoming an increasingly important resource for (humanities) researchers, it also becomes paramount to investigate and understand how such archives are built and how to make the processes involved transparent. Emily Maemura, Nicholas Worby, Ian Milligan, and Christoph Becker compare three use cases and suggest a framework for documenting Web archive provenance.
Introduction: This article proposes a collaboration between FactMiners and the Transkribus project that would help the Transkribus team evolve the “sustainable virtuous” ecosystem they described as a Transcription & Recognition Platform: a Social Machine for Job Creation & Skill Development in the 21st Century.
Introduction: Apart from its buoyant conclusion that authorship attribution methods are rather robust to noise (transcription errors) introduced by optical character recognition and handwritten text recognition, this article also offers a comprehensive read on the application of sophisticated computational techniques for testing and validation in a data curation process.
Introduction: The rperseus package provides classicists and other people interested in ancient philology and exegesis with corpora of texts from the ancient world (based on the Perseus Digital Library), combined with a toolkit designed to compare passages and selected words with parallels where the same expressions or words occur.
Introduction: This article explains the concept, uses, and procedural steps of text mining. It also provides information on available teaching courses and encourages readers to use the OpenMinTeD platform for this purpose.
Introduction: Processing XML flows has traditionally been a complicated affair, and XProc was designed to standardise and simplify the process by using declarative XML pipelines to manage operations. This blog post by Gioele Barabucci presents conclusions from a meeting of the XProc 3.0 working group in late 2017, exploring the emerging version of the standard and the kinds of challenges it will overcome.
Introduction: This report (available in English, French, German, Polish and Spanish) summarizes the findings of a web-based survey conducted in 2014/2015 by the Digital Methods and Practices Observatory (DiMPO), a DARIAH working group.
Introduction: The article discusses how letters are being used across the disciplines, identifying similarities and differences in transcription, digitisation and annotation practices. It is based on a workshop held after the end of the project Digitising experiences of migration: the development of interconnected letters collections (DEM). The aims were to examine issues and challenges surrounding digitisation, build capacity relating to correspondence mark-up, and initiate the process of interconnecting resources to encourage cross-disciplinary research. Subsequent to the DEM project, TEI templates were developed for capturing information within and about migrant correspondence, and visualisation tools were trialled with metadata from a sample of letter collections. Additionally, as a demonstration of how the project’s outputs could be repurposed and expanded, the correspondence metadata that was collected for DEM was added to a more general correspondence project, Visual Correspondence.
Introduction: In the context of medieval and early Tudor texts scholarship, this paper discusses the methodological use of the database not simply to store information, but to clarify points of tension between the questions asked and the information provided in order to find answers.
Introduction: How do we improve the quality of the fledgling practice of Web archeology, so much needed now that a first decade of Web information threatens to disappear as current interest wanes, even though its contemporaneous cultural value is undisputed? A scientific report from the National Library of the Netherlands investigates.