Web Documents Quality Assessment for Digital Humanities Scholars

Introduction by OpenMethods Editor (Joris van Zundert): Now that research sources are increasingly digital, how do we establish the quality of such sources? Researchers from Amsterdam University and the Free University of Amsterdam propose a framework for quality assessment based on natural language processing techniques, potentially classifying as many as 90% of articles accurately for quality.

We present a framework for assessing the quality of Web documents, and a baseline of three quality dimensions: trustworthiness, objectivity and basic scholarly quality. Assessing Web document quality is a “deep data” problem necessitating approaches to handle both data size and complexity. Traditional quality assessment methodologies are tailored to physical documents such as books, and qualitatively evaluate their authors and other metadata. These practices need to be extended to respond to the specific nature of online sources.

Original publication date: 2016.

Source: Towards Web Documents Quality Assessment for Digital Humanities Scholars, WebSci’16

Author: Author on Source

Drs. Joris J. van Zundert (1972) is a senior researcher and developer in humanities computing. He holds a research position in the department of literary studies at the Huygens Institute for the History of The Netherlands, a research institute of The Netherlands Royal Academy of Arts and Sciences (KNAW). His main interest as a researcher and developer is in the possibilities of computational algorithms for the analysis of literary and historic texts, and the nature and properties of humanities information and data modeling. His current research focuses on computer science and humanities interaction and the tensions between hermeneutics and ‘big data’ approaches.
