A Scalable Framework for Dynamic Data Citation of Arbitrary Structured Data

S. Pröll, A. Rauber:
"A Scalable Framework for Dynamic Data Citation of Arbitrary Structured Data";
Talk: 3rd International Conference on Data Management Technologies and Applications (DATA 2014), Vienna; 29.08.2014 - 31.08.2014; in: "International Conference on Data Management Technologies and Applications", SCITEPRESS Digital Library, (2014), ISBN: 978-989-758-035-2.

Abstract:


Sharing research data is becoming increasingly important because it enables peers to validate and reproduce
data-driven experiments. Exchanging data also allows scientists to reuse data in different contexts and gather new
knowledge from available sources. But as the volume of data increases, researchers need to reference exact
versions of datasets. Until now, access to research data has often been based on single archives of data files, where
versioning and subsetting support is limited. In this paper we introduce a mechanism that allows researchers
to create versioned subsets of research data which can be cited and shared in a lightweight manner. We
demonstrate a prototype that supports researchers in creating subsets based on filtering and sorting source
data. These subsets can be cited for later reference and reuse. The system produces evidence that allows
users to verify the correctness and completeness of a subset based on cryptographic hashing. We describe a
replication scenario for enabling scalable data citation in dynamic contexts.
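
The idea of verifying a filtered and sorted subset with a cryptographic hash can be illustrated with a minimal sketch. The abstract does not name the hash function, the serialization format, or any API, so everything below is an assumption made for illustration: SHA-256, a canonical JSON encoding of each row, and the hypothetical helpers `subset_hash` and `verify_subset`. It is not the prototype described in the paper.

```python
import hashlib
import json


def subset_hash(rows, predicate, sort_key):
    """Filter and sort rows, then hash a canonical serialization of the result.

    Assumptions (not from the paper): rows are JSON-serializable dicts,
    sort_key yields a unique value per row, and SHA-256 is the hash.
    """
    subset = sorted((row for row in rows if predicate(row)), key=sort_key)
    digest = hashlib.sha256()
    for row in subset:
        # sort_keys fixes the field order so equal rows serialize identically
        line = json.dumps(row, sort_keys=True, separators=(",", ":"))
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")  # row delimiter
    return subset, digest.hexdigest()


def verify_subset(rows, predicate, sort_key, cited_hash):
    """Re-create the subset and compare its hash with the cited one."""
    _, current_hash = subset_hash(rows, predicate, sort_key)
    return current_hash == cited_hash


# Hypothetical source data: the subset is "temp above 20, ordered by id".
source = [
    {"id": 2, "temp": 21.5},
    {"id": 1, "temp": 19.0},
    {"id": 3, "temp": 25.1},
]
subset, cited = subset_hash(source, lambda r: r["temp"] > 20, lambda r: r["id"])
```

Storing `cited` alongside the filter and sort criteria lets a later reader re-run the same selection and call `verify_subset` to check that the subset is unchanged. Because the rows are sorted on a unique key before hashing, the hash in this sketch does not depend on the order in which the source rows are stored.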