The conversation below is a special summer episode of our Spotlight series. It is a collaboration between OpenMethods and the Humanista podcast, and this time it comes as a podcast in which Alíz Horváth, owner of the Humanista podcast series and proud Editorial Team member of OpenMethods, asks Shih-Pei Chen, scholar and Digital Content Curator at the Max Planck Institute for the History of Science, about the text analysis tools LoGaRT, RISE and SHINE; non-Latin scripted Digital Humanities; why local gazetteers are goldmines for Asian Studies; how digitization changes and broadens the kinds of research questions one can study; where the challenges lie in accessing cultural heritage and liaising with proprietary infrastructure providers… and much more! Enjoy!
https://openmethods.dariah.eu/2022/05/11/open-source-tool-allows-users-to-create-interactive-timelines-digital-humanities-at-a-state/ OpenMethods introduction to: Collaborative Digital Projects in the Undergraduate Humanities Classroom: Case Studies with Timeline JS 2022-05-11 07:28:36 Marinella Testori Blog post Creation Data Designing Digital Humanities English Methods…
Introduction: Awarded Best Long Paper at the 2019 NAACL (North American Chapter of the Association for Computational Linguistics) Conference, the contribution by Jacob Devlin et al. presents “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding” (https://aclanthology.org/N19-1423/).
As highlighted by the authors in the abstract, BERT is a “new language representation model” and, in the past few years, it has become widespread in various NLP applications; one project building on it, for example, is CamemBERT (https://camembert-model.fr/), a language model for French.
In June 2021, a workshop organized by David Mimno, Melanie Walsh and Maria Antoniak (https://melaniewalsh.github.io/BERT-for-Humanists/workshop/) showed how to use BERT in digital humanities projects for tasks such as word similarity and classification, relying on the Python-based HuggingFace transformers library (https://melaniewalsh.github.io/BERT-for-Humanists/tutorials/). A further advantage of this training resource is that it has been written with its target audience in mind: it provides a gentle introduction to the complexities of language models for scholars whose education and background lie outside Computer Science.
Along with the Tutorials, the same blog includes Introductions about BERT in general and in its specific usage in a Google Colab notebook, as well as a constantly-updated bibliography and a glossary of the main terms (‘attention’, ‘Fine-Tune’, ‘GPU’, ‘Label’, ‘Task’, ‘Transformers’, ‘Token’, ‘Type’, ‘Vector’).
OpenMethods Spotlights showcase people and epistemic reflections behind Digital Humanities tools and methods. Here you can find brief interviews with the creator(s) of the blogs or tools that are highlighted on OpenMethods, to humanize and contextualize them. In the first episode, Alíz Horváth talks with Hilde de Weerdt at Leiden University about MARKUS, a tool that offers a variety of functionalities for the markup, analysis, export, linking, and visualization of texts in multiple languages, with a special focus on Chinese and now Korean as well.
East Asian studies are still largely underrepresented in digital humanities. Part of the reason is the relative lack of tools and methods that can be used smoothly with non-Latin scripts. MARKUS, developed by Brent Ho within the framework of the Communication and Empire: Chinese Empires in Comparative Perspective project led by Hilde de Weerdt at Leiden University, is a comprehensive tool that helps mitigate this issue. Selected as a runner-up in the category “Best tool or suite of tools” in the DH2016 awards, MARKUS offers a variety of functionalities for the markup, analysis, export, linking, and visualization of texts in multiple languages, with a special focus on Chinese and now Korean as well.
Introduction: In this article, José Calvo Tello offers a methodological guide to data curation for creating a literary corpus for quantitative analysis. This brief tutorial covers all stages of the curation and creation process and guides the reader towards practical cases from Hispanic literature. The author deals with every single step in the creation of such a corpus: from digitization, metadata, and automatic processes for cleaning and mining the texts, to licenses, publishing and archiving/long-term preservation.
Introduction: In this post, you can find a thoughtful and encouraging selection and description of reading, writing and organizing tools. It guides you through a whole discovery-management-writing-publishing workflow, from the creation of annotated bibliographies in Zotero, through a useful Markdown syntax cheat sheet, to versioning, storage and backup strategies, and shows how everybody’s research can profit from open digital methods even without sophisticated technological skills. What I particularly like in Tomislav Medak’s approach is that all these tools, practices and tricks are filtered through and tested against his own everyday scholarly routine. It would make perfect sense to create a visualization from this inventory in a similar fashion to these workflows.
Introduction: This white paper is an outcome of a DH2019 workshop dedicated to fostering closer collaboration between technology-oriented DH researchers and developers of tools that support Digital Humanities research. The paper briefly outlines the most pressing issues in their collaboration and addresses topics such as good practices to ease mutual understanding between scholars and researchers; software development and academic career and recognition; and sustainability and funding.
Introduction: Sustainability questions, such as how to maintain digital project outputs after the funding period or how to keep aging code and infrastructure that are important for our research up to date, are among the major challenges DH projects are facing today. This post gives us a sneak peek into the solutions and working practices of the Center for Digital Humanities at Princeton. In their approach to building capacity for sustaining DH projects and preserving access to data and software, they view projects as collaborative and process-based scholarship. Their focus is therefore on implementing project management workflows and documentation tools that can be flexibly applied to projects of different scopes and sizes and that also allow for further refinement in due course. By sharing these resources together with their real-life use cases in DH projects, they aim to benefit other scholarly communities and sustain a broader conversation about these tricky issues.
Introduction: The world of R consists of innumerable packages. Most of them have very low download rates because they are limited to certain functions as part of a larger argument. Based on a surprising experience with the small package clipr, Matthew Lincoln shares his thoughts about this reception phenomenon, especially in the digital humanities.