The KPLEX project was created with a two-fold purpose: first, to expose potential areas of bias in big data research, and second, to do so using methods and challenges coming from a research community that has been relatively resistant to big data, namely the arts and humanities. The project’s founding supposition was that there are practical and cultural reasons why humanities research resists datafication, a process generally understood as the replacement of research objects and processes in their original state with digital, quantified or otherwise more structured streams of information. The project’s further assumption was that these very reasons for resistance could be instructive for the critical observation of big data research and innovation. To understand clearly the features of humanistic and cultural data, approaches, methodologies, institutions and challenges is to see the fault lines where datafication and algorithmic parsing may fail to deliver on what they promise, or may hide the very insight they propose to expose. As such, the aim of the KPLEX project has been to pinpoint areas where different research communities’ understandings of what the creation of knowledge is and should be diverge, and, from this unique perspective, to propose where further work can and should be done.
Although the KPLEX project had only a short duration (15 months), its results point toward a number of central issues and possible development avenues for a future of big data research that is socially aware and informed, but which also harnesses opportunities to explore new pathways to technical innovation. The challenge for the future of this research and for its exploitation will be to overcome the social and cultural barriers between the languages and practices not only of research communities, but also of the ICT industry and policy sectors. The KPLEX results point toward clear potential value in these areas, for the uptake of the results, their application to meet societal challenges, and for improving public knowledge and action. Such reuse may take significant investment and time, so as to establish a common vocabulary and overturn long-standing biases and power dynamics. The potential benefits, however, could be great in terms of technical, social and cultural innovation.
The KPLEX project was originally structured around four key themes. To address these themes, the project carried out a large and multiperspectival information gathering exercise, comprising four distinct surveys, tens of hours of interviews, and a data mining exercise. Participants were drawn from a range of perspectives around the practice of big data research and analysis, including cultural heritage practitioners, researchers from the humanities, science and engineering, and technology developers.
The KPLEX project started with a three-month period in which the project partners began to make intellectual preparations for the work. During this period, the recruitment process was completed and the kick-off meeting was held in February 2017 (M2) at the Trinity College Dublin Long Room Hub, attended by all members of the KPLEX Project Management Board (PMB) and the project’s EU project officer. The Trinity College Dublin Faculty of Arts, Humanities and Social Sciences Ethics Committee advised the KPLEX PMB on the project’s ethical issues, with special focus on privacy aspects and issues concerning the protection of data. Approval in principle was received in January 2017 (M1), with final approval received at the start of July 2017 (M7) upon review of the project surveys and questionnaires. Work towards the primary research objectives was conducted from month four to month fourteen. The completion of Milestone Two: First Knowledge Consolidation Point (M7) was marked with a face-to-face project meeting at DANS. A further face-to-face project meeting took place in Versailles in November 2017 (M11) to coincide with the project’s contribution to the Big Data Value Forum. The completion of Milestone Three: Final Knowledge Consolidation Point (M14) was marked with a Project Writing Sprint in Nenagh in February 2018. All three meetings were attended by representatives of each of the project partners.
The project’s Data Management Plan (DMP) was revised as an early requirement of the project. The DMP consists of a living document created in the templates within the 'DMPonline' tool, part of the Open Research Data Pilot (ORDP) funded under Horizon 2020.
The KPLEX project team was recruited in such a way as to be an experiment and a case study in interdisciplinary, applied research with a foundation in the humanities. Each of the four partner research groups was drawn from a very different research community, with different fundamental expectations of and from the knowledge creation process. The team of four partners included research groups in both digital humanities and anthropology, a research data archive and an SME specialising in language technologies. This diversity was a strength of the project, but also a constant reminder of how challenging such cooperative work, across disciplines and sectors, can be.
The following details are taken from project D1.1 “Final Report on the Exploitation, Translation and Reuse Potential for Project Results”.
The most compelling outcomes of the project stand at the intersection of the perspectives and themes pursued by these individual work packages and tasks. These points, which cover a wide range of issues around the complexity of the phenomena represented in data and the potential biases inherent in the methods most commonly used to interrogate them, are listed below. The resonances between and across the perspectives mined by the project illuminate those areas where we can evidence fundamental challenges to big data research, or opportunities for innovative future activities. These topics will not be simple to pursue, since some of them are viewed by some key contributors as unnecessary barriers to technical progress. It is clear, however, that such inconvenient truths of big data research are beginning to have an undesirable societal impact, and the KPLEX conclusions, while requiring courage to implement, can provide a solid foundation for addressing many of them. The eleven areas of further research identified by the project are as follows:
Big data is ill-suited to representing complexity: the urge toward easy interrogability can often result in obscurity and user disempowerment.
Big data compromises rich information.
Standards are both useful and harmful.
The appearance of openness can be misleading.
Research based on big data is overly opportunistic.
How we talk about big data matters.
Big data research should be supported by a greater diversity in approaches.
Even big data research is about narrative, which has implications for how we should observe its objectivity or truth value.
The dark side of context: dark linking and de-anonymisation.
Organisational and professional practices.
Big data research and social confidence.
For further information regarding each of the eleven areas for further research listed above, please refer to D1.1 Final Report on the Exploitation, Translation and Reuse Potential for Project Results.
More info: https://kplex-project.eu/.