Digital Humanities

Which DH Tools Are Actually Used in Research?

Posted on February 12, 2020
by Ulrike Wuttke

This short 2019 blog post by Laure Barbot, Frank Fischer, Yoan Moranville, and Ivan Pozdniakov sheds some light on the old question of which DH tools are actually used in research and which are especially popular.

Code

Web Scraping with Python for Beginners | The Digital Orientalist

Posted on February 4, 2020 (updated February 5, 2020)
by Christopher Nunn

Introduction: In this blog post, James Harry Morris introduces the method of web scraping. Step by step, starting with the installation of the packages, it shows readers how to extract relevant data from websites using only the Python programming language and convert it into a plain text file. Each step is presented transparently and comprehensibly, making the article a prime example of OpenMethods: it equips readers to work with amounts of data too large to process manually.
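The extraction step described above can be illustrated with a minimal sketch. It is not the code from the tutorial, whose packages are not named here: it assumes only Python's standard library, and it parses a small made-up page (`SAMPLE_HTML`) instead of fetching a live website. The names `TextExtractor`, `extract_text`, and `save_text` are hypothetical helpers introduced for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def save_text(text, path):
    """Write the extracted text to a plain text file (UTF-8)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text + "\n")


# Made-up sample page standing in for a downloaded website.
SAMPLE_HTML = """
<html><head><title>Sample page</title><style>p { color: red; }</style></head>
<body><h1>Chapter One</h1><p>First paragraph.</p>
<script>console.log("ignored");</script><p>Second paragraph.</p></body></html>
"""
```

Calling `save_text(extract_text(SAMPLE_HTML), "scraped.txt")` would write the four visible text chunks to a file. For a real site, the HTML string would first have to be downloaded, for example with `urllib.request.urlopen`, and the site's terms of use checked beforehand.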

Community Building

DH Research Software Engineers – For We Are Many

Posted on November 11, 2019 (updated November 14, 2019)
by Erzsebet Tóth-Czifra

Introduction: This white paper is an outcome of a DH2019 workshop dedicated to fostering closer collaboration among technology-oriented DH researchers and developers of tools that support Digital Humanities research. The paper briefly outlines the most pressing issues in their collaboration and addresses topics such as good practices to ease mutual understanding between scholars and researchers, software development in relation to academic careers and recognition, and sustainability and funding.

Analysis

Analyzing Documents with TF-IDF | Programming Historian

Posted on September 15, 2019 (updated September 16, 2019)
by Rombert Stapel

Introduction: The indispensable Programming Historian offers an introduction to Term Frequency – Inverse Document Frequency (tf-idf) by Matthew J. Lavin. The procedure, which measures how specific a term is to a document, has its origins in information retrieval, but it can also be applied as an exploratory tool, as a way of finding textual similarity, or as a pre-processing step for machine learning. It is therefore useful not only for textual scholars but also for historians working with large collections of text.
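To make the idea of term specificity concrete, here is a minimal sketch of one common textbook variant of tf-idf: raw term count multiplied by the natural logarithm of (number of documents / number of documents containing the term). This is an assumption for illustration and not necessarily the weighting used in the lesson; libraries often apply smoothing and normalisation on top. The function name `tf_idf` and the whitespace tokenisation are likewise illustrative choices.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    Score = raw term count * ln(N / document frequency),
    a common unsmoothed textbook variant.
    """
    # Naive tokenisation: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append(
            {
                term: count * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return scores
```

With the three toy documents `"the cat sat"`, `"the dog sat"`, and `"the bird flew"`, the word "the" occurs everywhere and scores zero, while "cat" occurs in only one document and receives the highest score in the first document. That is the sense in which tf-idf rewards terms that are specific to a document.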

Digital Humanities

TEIdown: Uso de Markdown extendido para el marcado automático de documentos TEI

Posted on July 23, 2019 (updated July 24, 2019)
by Gimena Del Rio

Introduction: In this article, Alejandro Bia Platas and Ramón P. Ñeco García introduce TEIdown, an extension of the Markdown syntax aimed at creating XML-TEI documents, along with the accompanying transformation programs. TEIdown helps editors validate TEI documents and find errors in them.

Artifacts

The Uncanny Valley and the Ghost in the Machine

Posted on June 28, 2019
by Joris van Zundert

Introduction: There is a postulated level of anthropomorphism at which people find the appearance of a robot uncanny. But what happens when digital facsimiles and online editions become nigh indistinguishable from the real thing while remaining so vastly different in material terms? How do we ethically provide access to the digital object without creating a blind spot for, and neglect of, the real thing? These are questions that keep digital librarian Dot Porter awake, and she ponders them in this thoughtful contribution.

German

Modernes Tool für alte Texte

Posted on June 12, 2019 (updated June 17, 2019)
by Stefan Karcher

Introduction: Computer scientists and humanists at the University of Würzburg have jointly developed a new and promising OCR tool to simplify text recognition in historical prints. “OCR4all” is freely available and works very reliably. The article describes its development and functions and links to a well-documented GitHub repository where you can test the tool for yourself.

Data

Standardization Survival Kit – Create a dictionary in TEI

Posted on January 20, 2019 (updated March 4, 2019)
by Erzsebet Tóth-Czifra

Introduction: Standards are best explained through real-life use cases. The Parthenos Standardization Survival Kit (SSK) is a collection of research use case scenarios illustrating best practices in Digital Humanities and Heritage research. It is designed to support researchers in selecting and using the appropriate standards for their particular disciplines and workflows. The latest addition to the SSK is a scenario for creating a born-digital dictionary in TEI.

Analysis

Towards Scientific Workflows and Computer Simulation as a Method in Digital Humanities – Digitale Bibliothek – Gesellschaft für Informatik e.V.

Posted on December 10, 2018 (updated January 20, 2019)
by Erzsebet Tóth-Czifra

Introduction: The explore! project tests computer simulation and text mining on autobiographic texts as well as the reusability of the approach in literary studies. To facilitate the application of the proposed method in a broader context and to new research questions, the text analysis is performed by means of scientific workflows that allow for the documentation, automation, and modularization of the processing steps. By enabling the reuse of proven workflows, the project aims to enhance the efficiency of data analysis in similar projects and further advance collaboration between computer scientists and digital humanists.

Content Analysis

Not All Character N-grams Are Created Equal: A Study in Authorship Attribution – ACL Anthology

Posted on November 19, 2018 (updated December 11, 2018)
by Florian CAFIERO

Introduction: Studying character n-grams is today a classical choice in authorship attribution. While the optimal length of these n-grams has been discussed, we still have few clues about which specific types of n-grams are most helpful in efficiently identifying the author of a text. This paper partly fills that gap by showing that most of the information gained from studying character n-grams comes from affixes and punctuation.
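For readers unfamiliar with character n-grams, the following minimal sketch shows how they can be counted. It does not reproduce the paper's typology of n-gram categories: the helper `contains_punctuation` is a deliberately coarse, hypothetical stand-in that only flags n-grams containing an ASCII punctuation mark, and the function names and the default length of 3 are assumptions made for illustration.

```python
import string
from collections import Counter


def char_ngrams(text, n=3):
    """Count all overlapping character n-grams of length n in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def contains_punctuation(ngram):
    """Coarse illustrative check: does the n-gram include ASCII punctuation?"""
    return any(ch in string.punctuation for ch in ngram)


def punctuation_ngrams(text, n=3):
    """Return only the counted n-grams that contain punctuation."""
    return Counter(
        {
            gram: count
            for gram, count in char_ngrams(text, n).items()
            if contains_punctuation(gram)
        }
    )
```

In an attribution setting, counts like these would be computed per text and compared across candidate authors. Separating the n-grams into groups, as the coarse punctuation filter does here, is one way to examine which groups carry the most signal.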

OpenMethods

HIGHLIGHTING DIGITAL HUMANITIES METHODS AND TOOLS

Tag: digital humanities