Analysis

The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models

Posted on April 29, 2021
by Erzsebet Tóth-Czifra

Introduction: NLP models and the tasks they perform are becoming an integral part of our daily realities (everyday or research). A central concern of NLP research is that, for many of their users, these models still largely operate as black boxes, with limited insight into why a model makes certain predictions, how its usage is skewed towards certain content types, what the underlying social and cultural biases are, etc. The open source Language Interpretability Tool aims to change this for the better and brings transparency to the visualization and understanding of NLP models. The pre-print describing the tool comes with rich documentation and description of the tool (including case studies of different kinds) and gives us an honest SWOT analysis of it.

Analysis

Programmable Corpora: Introducing DraCor, an Infrastructure for the Research on European Drama

Posted on February 8, 2021
by Marinella Testori

Introduction: The DraCor ecosystem encourages various approaches to browsing and consulting the data collected in the corpora, like those detailed in the Tools section: the Shiny DraCor app (https://shiny.dracor.org/), along with the SPARQL queries and the Easy Linavis interfaces (https://dracor.org/sparql and https://ezlinavis.dracor.org/ respectively). The project thus aims at creating a suitable digital environment for developing an innovative way to approach literary corpora, potentially open to collaborations and interactions with other initiatives thanks to its ontology and Linked Open Data-based nature.
[Click ‘Read more’ for the full post!]

Capture

Offen, vielfältig und kreativ. Ein Bericht zum Barcamp Data Literacy #dhddatcamp20 bei der DHd 2020 | DHd-Blog

Posted on November 26, 2020, updated November 27, 2020
by Erzsebet Tóth-Czifra

Introduction: What are the essential data literacy skills in (Digital) Humanities? How can good data management practices be translated to humanities disciplines, and how can more and more humanists be engaged in such conversations? Ulrike Wuttke’s reflections on the “Vermittlung von Data Literacy in den Geisteswissenschaften” barcamp at the DHd 2020 conference not only make us nostalgic for scholarly meetings happening face to face but also give in-depth and contextualized insights into the questions above. The post comes with rich documentation (including links to the barcamp’s metapad, tweets, photos, follow-up posts) and can also serve as a guide for organizers of future barcamps.

Analysis

OpenMethods Spotlights #1: Interview with Hilde De Weerdt about MARKUS

Posted on October 13, 2020
by Alíz Horváth

OpenMethods Spotlights showcase people and epistemic reflections behind Digital Humanities tools and methods. Here you can find brief interviews with the creator(s) of the blogs or tools that are highlighted on OpenMethods, to humanize and contextualize them. In the first episode, Alíz Horváth talks with Hilde de Weerdt at Leiden University about MARKUS, a tool that offers a variety of functionalities for the markup, analysis, export, linking, and visualization of texts in multiple languages, with a special focus on Chinese and now Korean as well.

Analysis

MARKUS – Comprehensive tool with the needs of non-Latin script users in mind

Posted on October 11, 2020, updated October 13, 2020
by Alíz Horváth

East Asian studies are still largely underrepresented in digital humanities. Part of the reason for this is the relative lack of tools and methods that can be used smoothly with non-Latin scripts. MARKUS, developed by Brent Ho within the framework of the Communication and Empire: Chinese Empires in Comparative Perspective project led by Hilde de Weerdt at Leiden University, is a comprehensive tool which helps mitigate this issue. Selected as a runner-up in the category “Best tool or suite of tools” in the DH2016 awards, MARKUS offers a variety of functionalities for the markup, analysis, export, linking, and visualization of texts in multiple languages, with a special focus on Chinese and now Korean as well.

Bibliographic Listings

GROBID: when data extraction becomes a suite

Posted on September 9, 2020, updated September 28, 2020
by Marinella Testori

Introduction: GROBID is an already well-known open source tool in the field of Digital Humanities, originally built to extract and parse bibliographical metadata from scholarly works. The acronym stands for GeneRation Of BIbliographic Data.
Shaped by use cases and adoption in a range of different DH and non-DH settings, the tool has progressively evolved into a suite of technical features currently applied to various fields, such as journals, dictionaries and archives.
[Click ‘Read more’ for the full post!]

Data

Das Projekt “GND für Kulturdaten” (GND4C)

Posted on September 1, 2020
by Ulrike Wuttke

Introduction: Standardized metadata, linked meaningfully using semantic web technologies, are prerequisites for cross-disciplinary Digital Humanities research as well as for FAIR data management. In this article from the Open Access journal o-bib, members of the project “GND for Cultural Data” (GND4C) describe how the Gemeinsame Normdatei (GND) (in English: Integrated Authority File), a widely accepted vocabulary for description and information retrieval in the library world, is maintained by the German National Library and how it supports semantic interoperability and reuse of data. The article also explores how the GND can be utilized and advanced collaboratively, integrating the perspectives of its multidisciplinary stakeholders, including the Digital Humanities. For background reading, the training resources “Controlled Vocabularies and SKOS” (https://campus.dariah.eu/resource/controlled-vocabularies-and-skos) and “Formal Ontologies” (https://campus.dariah.eu/resource/formal-ontologies-a-complete-novice-s-guide) are of interest.

Archiving

Exposing legacy project datasets in Digital Humanities | King’s Digital Lab

Posted on July 28, 2020, updated July 29, 2020
by Erzsebet Tóth-Czifra

Introduction: Issues around sustaining digital project outputs after their funding period are a recurrent topic on OpenMethods. In this post, Arianna Ciula introduces the King’s Digital Lab’s solution, a workflow around their CKAN (Comprehensive Knowledge Archive Network) instance, and uncovers the many questions around not only maintaining a variety of legacy resources from long-running projects, but also opening them up for data re-use, verification and integration beyond siloed resources.

Analysis

Research COVID-19 with AVOBMAT

Posted on June 8, 2020, updated June 16, 2020
by Christopher Nunn

Introduction: In our guidelines for nominating content, databases are explicitly excluded. However, this database is an exception, which is not due to the burning issue of COVID-19, but to the exemplary variety of digital humanities methods with which its data can be processed. AVOBMAT makes it possible to process 51,000 articles with almost every conceivable approach (Topic Modeling, Network Analysis, N-gram viewer, KWIC analyses, gender analyses, lexical diversity metrics, and so on) and is thus much more than just a simple database – rather, it is a welcome stage for the Who is Who (or What is What?) of OpenMethods.

Annotating

Ediarum. A toolbox for editors and developers | RIDE

Posted on April 14, 2020, updated April 15, 2020
by Erzsebet Tóth-Czifra

Introduction: The RIDE journal (the Review Journal of the Institute for Documentology and Scholarly Editing) aims to offer a solution to current misalignments between scholarly workflows and their evaluation, and provides a forum for the critical evaluation of the methodology of digital edition projects. This time, we have been cherry-picking from their latest issue (Issue 11), dedicated to the evaluation and critical improvement of tools and environments.
Ediarum is a toolbox developed for editors by the TELOTA initiative at the BBAW in Berlin to generate and annotate TEI-XML data in German. In his review, Andreas Mertgens touches upon issues regarding methodology and implementation, use cases, deployment and learning curve, Open Source, sustainability and extensibility of the tool, user interaction and GUI, and of course gives a rich functional overview.
[Click ‘Read more’ for the full post!]

OpenMethods

HIGHLIGHTING DIGITAL HUMANITIES METHODS AND TOOLS

Category: Data
