Wrangling with Non-Standard Data

Status: Finished

First online: 12-06-2020

Updated: NA

Authors: Eetu Mäkelä [1], Krista Lagus [1], Leo Lahti [2], Tanja Säily [1], Mikko Tolonen [1], Mika Hämäläinen [1], Samuli Kaislaniemi [3], and Terttu Nevalainen [1]

[1] HELDIG – Helsinki Centre for Digital Humanities, University of Helsinki, Finland

[2] Department of Future Technologies, University of Turku

[3] University of Eastern Finland

This study has been published in Proceedings of the Digital Humanities in the Nordic Countries 5th Conference

Abstract

Research in the digital humanities (DH) and computational social sciences (CSS) requires overcoming complexity in research data, methodology, and research questions. In this article, we show through case studies of three different DH and CSS projects that these problems are prevalent, multiform, and laborious to counter. Yet without facilities for acknowledging, detecting, handling, and correcting for such bias, any results based on the material will be faulty.

Therefore, we argue for wider recognition and acknowledgement of the problematic nature of many DH/CSS datasets, and correspondingly of the amount of work required to render such data usable for research. These arguments have implications both for evaluating the feasibility of project proposals and allocating funding to them, and for assigning academic value and credit to the labour of cleaning up and documenting datasets of interest.

Keywords

Complexity, Data Issues, Non-Standard Data, Bias, Interpretation, Workflows
