Introduction: The indispensable Programming Historian offers an introduction to Term Frequency – Inverse Document Frequency (tf-idf) by Matthew J. Lavin. The procedure, which measures how specific a term is to a document, has its origins in information retrieval, but it can also serve as an exploratory tool, as a way of finding textual similarity, or as a pre-processing step for machine learning. It is therefore useful not only for textual scholars, but also for historians working with large collections of text.
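To make the idea of term specificity concrete, here is a minimal sketch in plain Python. It assumes the textbook weighting (raw term count multiplied by the natural log of the number of documents divided by the number of documents containing the term). The tutorial itself, and libraries in general, may use smoothed or normalised variants. The function name `tf_idf` and the three sample sentences are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumed weighting (textbook form, not necessarily the tutorial's):
      tf  = raw count of the term in the document
      idf = ln(N / df), where N is the number of documents and
            df is the number of documents containing the term
    """
    # Naive tokenisation: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return scores


# Hypothetical mini-corpus, invented for this example.
docs = [
    "the charter names the abbot",
    "the letter names the bishop",
    "the charter was sealed",
]
scores = tf_idf(docs)

# Print the terms of the first document, most specific first.
for term, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {score:.3f}")
```

Under these assumptions, a word that occurs in every document ("the") scores zero, while a word found in only one document ("abbot") scores highest. This is the sense in which tf-idf rewards terms specific to a document.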
Introduction: This post explains the lemmatization step needed before topic modelling French or European texts with Mallet.
Introduction: The post discusses the challenges that the traditional philological approach faces in creating digital corpora of critical editions of non-vernacular medieval works.
Introduction: This article reflects on the lessons the author learnt when he first taught a graduate course in the digital analysis of literary texts. He stresses the importance of methodologies over technologies, the need for well-curated, community-created teaching datasets, and the implications of the practical, discipline-based organisation of curricula.
Introduction: This post presents the program and the video of a seminar on software for 3D geographical data capture and visualization.
Introduction: This publication demonstrates how a specific digital method can serve as a practical tool for digital research and analysis.
Introduction: This US project offers an interface for various analyses of scanned data and documents.
Introduction: This post highlights the analysis of illuminated manuscripts in art history before and after the advent of digital methods and tools.
Introduction: This is a 2014 conference report on digital paleography and big data of the past, on epigraphic paleography, and on the Oriflamms and DigiPal projects.
Introduction: This post offers reflections, prompted by a DH conference, on how to define a digitized version of an artefact.