Getting the most out of our data! Benefits from the analysis of large-scale bibliographic metadata

Authors

  • Angela Vorndran, Deutsche Nationalbibliothek

DOI:

https://doi.org/10.5282/o-bib/2018H4S166-180

Keywords:

metadata analysis, standard data enrichment, data reconciliation

Abstract

The German National Library (DNB) strives to make use of the large number of bibliographic records in culturegraph.org. More than 160 million records originating from German and Austrian regional library networks, the British National Bibliography and the DNB can be used for data analyses, the evaluation of connections and statistical analyses. This paper gives an overview of the central topics. The first is the clustering of works, so that each cluster comprises the different editions and translations of a work. Indexing and classification information as well as links to authority data can then be shared among the members of each cluster to achieve a gain in standardization and subject indexing. The second is the use of external data as a source for enriching bibliographic records. This is exemplified by matching data from the Open Researcher and Contributor ID (ORCID) with the Integrated Authority File (GND): using bibliographic records from Culturegraph, persons are matched on the basis of the titles of their publications. Finally, a few statistical analyses of the aggregated data are presented.

Published

2018-12-10

Section

Conference proceedings

How to Cite

Vorndran, A. (2018). Getting the most out of our data! Benefits from the analysis of large-scale bibliographic metadata. O-Bib. Das Offene Bibliotheksjournal Herausgeber VDB, 5(4), 166-180. https://doi.org/10.5282/o-bib/2018H4S166-180