Open Access metadata: OAPEN partners with Penn State University Libraries 

Silke Davison

Mon 23 Oct 2023

This blog post was written in collaboration with Jeff Edmunds, Digital Access Coordinator at the Penn State University Libraries. 

For many years, OAPEN has made metadata describing the titles in its collection freely available for download on its website. 

But because OAPEN’s metadata originates with publishers, more than four hundred of which are represented in the OAPEN Library, ensuring metadata quality and consistency is challenging. The MARCXML file available on the OAPEN site, for example, can be easily transformed into MARC, the metadata format favored by libraries worldwide, but the resulting MARC data can be both error-ridden and incomplete.

Many libraries therefore rely on metadata supplied by third-party for-profit entities rather than acquiring it from its source, OAPEN.

To address this less-than-optimal situation, the Penn State University Libraries are collaborating with OAPEN to provide MARC records for the entire OAPEN corpus, freely available via Penn State’s institutional repository, ScholarSphere, and licensed for reuse under a CC BY 4.0 license. 

The Penn State Libraries download the MARCXML file of metadata from the OAPEN site once a month, transform the records into MARC, and run a series of scripts to clean and improve the data.  
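To give a sense of what one such cleaning step might look like, here is a minimal sketch in Python. It is not Penn State's actual script, whose details are not described here. The sample record, the function names, and the specific rules (collapsing stray whitespace, dropping empty subfields and fields) are all assumptions chosen for illustration. It works directly on MARCXML using only the standard library; a real workflow would probably use a dedicated MARC library, such as pymarc, to write binary MARC afterward.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
NS = {"marc": MARC_NS}

# Hypothetical sample standing in for the MARCXML file downloaded from OAPEN.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>00000nam a2200000 a 4500</leader>
    <controlfield tag="001">example-0001</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">  An example   title </subfield>
      <subfield code="b"></subfield>
    </datafield>
    <datafield tag="650" ind1=" " ind2="4">
      <subfield code="a">   </subfield>
    </datafield>
  </record>
</collection>"""


def clean_record(record):
    """Collapse whitespace in subfields; drop empty subfields and empty datafields."""
    for datafield in list(record.findall("marc:datafield", NS)):
        for subfield in list(datafield.findall("marc:subfield", NS)):
            text = " ".join((subfield.text or "").split())
            if text:
                subfield.text = text
            else:
                datafield.remove(subfield)
        # A datafield with no subfields left carries no information.
        if datafield.find("marc:subfield", NS) is None:
            record.remove(datafield)
    return record


def clean_collection(xml_text):
    """Parse a MARCXML collection and clean every record in it."""
    root = ET.fromstring(xml_text)
    for record in list(root.iter(f"{{{MARC_NS}}}record")):
        clean_record(record)
    return root


cleaned = clean_collection(SAMPLE)
titles = [
    sf.text
    for sf in cleaned.findall(
        ".//marc:datafield[@tag='245']/marc:subfield[@code='a']", NS
    )
]
print(titles)  # prints ['An example title']
```

Rules like these are deliberately conservative: they remove noise without guessing at missing content, which matters when the source records come from hundreds of different publishers.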

“Cleaning the data, standardizing it, and making certain it complies with the MARC standard allows us to give back to the Open Access community. Libraries who want access to good quality MARC records for their Open Access titles now have an option to acquire them without cost and with no legal strings attached,” said Jeff Edmunds, Digital Access Coordinator at the Penn State University Libraries. 

For Open Access titles to be visible and findable in library discovery systems, good metadata is essential. Ideally, the metadata should also be open. 

“Our collaboration with Penn State allows us to provide metadata for OAPEN titles in a format most suited for easy absorption into library catalogs: MARC. What Jeff Edmunds and his colleagues are doing will allow libraries to acquire good quality MARC records for OAPEN titles that can be freely shared and manipulated, at no cost,” said Ronald Snijder, Chief Technology Officer and Head of Research at OAPEN. 

The OAPEN MARC records are available in ScholarSphere. 

The MARC files will be updated monthly to include records for the entire OAPEN corpus. 

“Our long-term goal is to create an infrastructure that allows the collaborative creation, maintenance, and sharing of open metadata for Open Access resources,” Snijder added. “This is a first step toward that ambitious goal.”