Every one of us is accustomed to reading academic contributions in the Latin alphabet, for which standard characters and formats already exist. But what about texts written in languages with different, ideographic writing systems (for example, Chinese and Japanese)? What recognition techniques and metadata must be adopted in order to represent them in a digital context?
Introduction: What are the essential data literacy skills in (Digital) Humanities? How can good data management practices be translated to humanities disciplines, and how can more and more humanists be engaged in such conversations? Ulrike Wuttke’s reflections on the “Vermittlung von Data Literacy in den Geisteswissenschaften” (Teaching Data Literacy in the Humanities) barcamp at the DHd 2020 conference not only make us heartily nostalgic for scholarly meetings happening face to face but also give in-depth, contextualized insights into the questions above. The post comes with rich documentation (including links to the barcamp’s metapad, tweets, photos, and follow-up posts) and also serves as a guide for future barcamp organizers.
Introduction: Named Entity Recognition (NER) is used to identify textual elements that name things, such as persons, locations, and organisations. In this study, four different NER tools are evaluated on a corpus of modern and classic fantasy and science fiction novels. Since NER tools have typically been developed for the news domain, it is interesting to see how they perform in a very different one. The article comes with a very detailed methodological section, and the accompanying dataset is also made available.
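To illustrate the kind of evaluation such a study involves, here is a minimal sketch of span-level precision, recall, and F1 for one NER tool against a gold standard. The sample sentence and entity spans are invented for illustration; they are not taken from the article’s corpus or its actual evaluation setup.

```python
# Sketch: exact-match, span-level evaluation of NER output against gold
# annotations. A predicted (start, end, label) triple counts as correct
# only if it appears verbatim among the gold annotations.

def evaluate_ner(gold, predicted):
    """Return (precision, recall, f1) for exact span-and-label matches."""
    gold_set = set(gold)
    pred_set = set(predicted)
    tp = len(gold_set & pred_set)  # true positives: exact matches
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical annotations for one sentence of a fantasy novel:
gold = [(0, 7, "PERSON"), (24, 30, "LOC")]       # e.g. a hero and a place
predicted = [(0, 7, "PERSON"), (24, 30, "ORG")]  # tool mislabels the place

p, r, f = evaluate_ner(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
# → precision=0.50 recall=0.50 f1=0.50
```

Exact matching is the strictest option; evaluations of NER in fiction often also report partial-match scores, since entity boundaries in narrative text are frequently ambiguous.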
Introduction: Apart from its encouraging conclusion that authorship attribution methods are rather robust to the noise (transcription errors) introduced by optical character recognition and handwritten text recognition, this article also offers a comprehensive read on applying sophisticated computational techniques for testing and validation in a data curation process.
Introduction: Processing XML workflows has traditionally been a complicated affair, and XProc was designed to standardise and simplify it by using declarative XML pipelines to manage operations. This blog post by Gioele Barabucci presents conclusions from a late-2017 meeting of the XProc 3.0 working group, exploring the latest emerging version of the standard and the kinds of challenges it will overcome.
Introduction: The article discusses how letters are being used across the disciplines, identifying similarities and differences in transcription, digitisation and annotation practices. It is based on a workshop held after the end of the project Digitising experiences of migration: the development of interconnected letters collections (DEM). The aims were to examine issues and challenges surrounding digitisation, build capacity relating to correspondence mark-up, and initiate the process of interconnecting resources to encourage cross-disciplinary research. Subsequent to the DEM project, TEI templates were developed for capturing information within and about migrant correspondence, and visualisation tools were trialled with metadata from a sample of letter collections. Additionally, as a demonstration of how the project’s outputs could be repurposed and expanded, the correspondence metadata that was collected for DEM was added to a more general correspondence project, Visual Correspondence.
Introduction: This post presents the programme and video of a seminar on software for 3D geographical data capture and visualization.
Introduction: This paper explores some of the new toolchains offered by the Open Web Platform, along with alternatives to consider in daily editing workflows.
Introduction: This post highlights the current and future projects, methods and tools in digital Assyriology.
Introduction: This post highlights digital methods and standards for the efficient analysis of historical data.