Introduction: Named Entity Recognition (NER) is used to identify textual elements that gives things a name. In this study, four different NER tools are evaluated using a corpus of modern and classic fantasy or science fiction novels. Since NER tools have been created for the news domain, it is interesting to see how they perform in a totally different domain. The article comes with a very detailed methodological part and the accompanying dataset is also made available.
Introduction: Apart from its buoyant conclusion that authorship attribution methods are rather robust to noise (transcription errors) introduced by optical character recognition and handwritten text recognition, this article also offers a comprehensive read on the application of sophisticated computational techniques for testing and validation in a data curation process.
Introduction: Processing XML flows has sometimes been a complicated affair traditionally, and XProc was designed to standardise and simplify the process by using declarative XML pipelines to manage operations. This blog post by Gioele Barabucci presents conclusions from a meeting in late 2017 of the XProc 3.0 working group, exploring the latest emerging version of the standard and the kinds of challenges it will overcome.
Introduction: The article discusses how letters are being used across the disciplines, identifying similarities and differences in transcription, digitisation and annotation practices. It is based on a workshop held after the end of the project Digitising experiences of migration: the development of interconnected letters collections (DEM). The aims were to examine issues and challenges surrounding digitisation, build capacity relating to correspondence mark-up, and initiate the process of interconnecting resources to encourage cross-disciplinary research. Subsequent to the DEM project, TEI templates were developed for capturing information within and about migrant correspondence, and visualisation tools were trialled with metadata from a sample of letter collections. Additionally, as a demonstration of how the project’s outputs could be repurposed and expanded, the correspondence metadata that was collected for DEM was added to a more general correspondence project, Visual Correspondence.
Introduction: This post proposes the program and the video of a seminar on a software for 3D geographical data capture and visualization.
Introduction: This paper explores some the new toolchains offered by the Open Web Platform and alternatives to be considered in the daily editing workflows.
Introduction: This post highlights the current and future projects, methods and tools in digital Assyriology.
Introduction: This post highlights digital methods and standards for an efficient analysis of historical data.
Introduction: This article discusses the question of minimal sample size in stylometry setting it up as low as 2,000 words in some cases.
Introduction: NeMO is a conceptual framework for DH. It offers a well-founded conceptualization of scholarly work, which can function as schema for a knowledge base containing information on scholarly research activity, including goals, actors, methods, tools and resources involved.