Getting started with OpenRefine – Digital Humanities 201

Introduction: OpenRefine, the freely available successor to Google Refine, is an ideal tool for cleaning up datasets and thus obtaining more reliable results. Entries can be listed in alphabetical order or sorted by frequency, so that typing errors and slightly different variants of the same value can be found and adjusted easily. For example, with the help of the software, I discovered two such discrepancies in my Augustinian Correspondence Database, which I am now able to correct with one click in the programme. It showed me that I had noted “As a reference to Jerome’s letter it’s not counted” five times and “As a reference to Jerome’s letter, it’s not counted” three times. Consequently, a search of the database for either wording would have returned only part of the results. A second discrepancy was between the entry “continuing reference (marked by Nam)” and the entry “continuing reference (marked by nam)”. Thanks to OpenRefine, such inconsistencies can be found and corrected systematically in the future.
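To make the idea concrete, here is a minimal Python sketch of the same kind of check: count how often each value occurs, then group values that become identical once case and punctuation are ignored. It is only an illustration in the spirit of OpenRefine's facets and clustering, not the programme's actual implementation. The helper name simple_key is hypothetical, the sample list is made up, and only the counts 5 and 3 come from the example above; the counts for the “Nam”/“nam” entries are invented.

```python
import re
from collections import Counter, defaultdict

# Sample values modelled on the discrepancies described above.
# The counts 5 and 3 come from the text; the "Nam"/"nam" counts are invented.
notes = (
    ["As a reference to Jerome's letter it's not counted"] * 5
    + ["As a reference to Jerome's letter, it's not counted"] * 3
    + ["continuing reference (marked by Nam)"] * 2
    + ["continuing reference (marked by nam)"] * 4
)


def simple_key(value: str) -> str:
    """Hypothetical helper: lowercase, drop punctuation, collapse whitespace."""
    without_punctuation = re.sub(r"[^\w\s]", "", value.lower())
    return " ".join(without_punctuation.split())


# Step 1: frequency of each distinct value (comparable to sorting a facet by count).
counts = Counter(notes)

# Step 2: group distinct values that share the same normalised key.
clusters = defaultdict(list)
for value, n in counts.most_common():
    clusters[simple_key(value)].append((value, n))

# Keep only the keys with more than one spelling: these are the likely discrepancies.
variant_groups = {key: group for key, group in clusters.items() if len(group) > 1}

for key, group in variant_groups.items():
    print(key)
    for value, n in group:
        print(f"  {n} x {value!r}")
```

Each printed group lists the competing spellings of one entry together with their frequencies, which is usually enough to decide which variant to keep.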

The tutorial by Miriam Posner is a useful introduction for getting acquainted with the software. However, the first step, the installation, is already out of date. While version 3.1 was still the latest when the tutorial was published, the current version is now 3.5.2. Under Windows, you can now choose between a version that requires a separate Java installation and a version with embedded OpenJDK Java, which I found very pleasing.

If needed, there are links at the end of the tutorial to other introductions that go into more depth.

Digital scholarship workflows

Introduction: In this post, you can find a thoughtful and encouraging selection and description of reading, writing and organizing tools. It guides you through a whole discovery-management-writing-publishing workflow, from the creation of annotated bibliographies in Zotero, through a useful Markdown syntax cheat sheet, to versioning, storage and backup strategies, and shows how everybody’s research can profit from open digital methods even without sophisticated technological skills. What I particularly like in Tomislav Medak’s approach is that all these tools, practices and tricks are filtered through and tested against his own everyday scholarly routine. It would make perfect sense to create a visualization from this inventory in a similar fashion to these workflows.