Frequently Asked Questions
Pennsylvania Digital Library
Why a Pennsylvania OAI Metadata Repository?
Who can participate?
Is there a charge to have your metadata harvested by PADL?
How is the service organized?
Who maintains PADL?
Who do I contact if I have questions related to PADL?
Registration
Who can register with the PADL?
What do I need to register with PADL?
What is the difference between Index Methods?
What happens after I register?
How many sites can my institution register?
Harvesting
What is harvesting?
How do I get my archive harvested in PADL?
What does PADL harvest?
How often does PADL harvest metadata from data providers?
How often is PADL refreshed?
What if my data does not display the way I expected?
What harvesting software does PADL use?
Access
How do users access the PADL?
Who will be able to access PADL?
What about documents with restricted access?
Are there any copyright issues related to PADL?
Will you store copies of our documents on PADL?
OAI-PMH
What is OAI-PMH?
How do I know if my repository is OAI-PMH compliant?
How can I make my repository OAI-PMH compliant?
Can anyone help me make my repository OAI-PMH compliant?
Where can I find more information on Open Archives and OAI-PMH?
Pennsylvania Digital Library
What is the Pennsylvania Digital Library?
The Pennsylvania Digital Library is a statewide metadata repository for digital resources created by Pennsylvania libraries, museums, and cultural heritage institutions. The University Library System (ULS) at the University of Pittsburgh has agreed to harvest, index, and provide a search interface for metadata describing digital collections created in Pennsylvania. This effort is based on the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
Who can participate?
The Pennsylvania Digital Library will harvest metadata from all Pennsylvania libraries, museums, and cultural heritage institutions that expose metadata for non-commercial collections of electronic documents or other digital collections. This material may be digitized content (e.g., historical photographs, manuscripts, etc.) or born-digital content (e.g., electronic theses and dissertations, e-prints, etc.). The registry will not include licensed subscription-based content, such as commercial databases and full-text services. At this point, the repository is committed to harvesting metadata created and maintained by Pennsylvania libraries and cultural heritage institutions, but it could serve as a model for other states in the region.
Is there a charge to have your metadata harvested by PADL?
There is no fee to be harvested by PADL. The PADL search engine is also available to all users on the Internet, free of charge.
Who maintains PADL? PADL was created and is hosted by the University Library System at the University of Pittsburgh.
How is the service organized? Pennsylvania libraries that create OAI-PMH harvestable metadata for their digital collections are the Data Providers. The University of Pittsburgh harvests this metadata and acts as the Service Provider. The Pennsylvania Advisory Committee on Collaborative Digitization (PACCD) is convening a metadata working group to act as liaison between the Data Providers and the Service Provider, as necessary. The final product is an OAI registry specific to Pennsylvania with a Web-based search engine called the Pennsylvania Digital Library, also known as PADL.
Role of the Member Institutions (Data Providers):
- Ensure that their repository sites are OAI-compliant according to the OAI Implementation Guidelines.
- Register their OAI-compliant sites.
- Host content and determine access restrictions.
- Promote and evaluate the repository.
Role of the University of Pittsburgh (Service Provider):
- Harvest and store OAI metadata exposed by the member sites on an ongoing basis.
- Provide a Web-based tool for member institutions to register their OAI-compliant sites.
- Provide a Web-based indexing and retrieval interface (search engine) to allow worldwide access to harvested metadata.
- Promote and evaluate the repository.
Role of the PACCD Metadata Working Group:
- Develop recommendations for Data Providers based on best practices and technical requirements.
- Act as liaison between the Data Providers and the Service Provider.
- Promote and evaluate the repository.
Who do I contact if I have questions related to PADL? You should use the "Contact Us" link on the PADL Web site. Your question will be distributed to the PADL group and answered accordingly.
Registration
Who can register with the PADL? Pennsylvania libraries, museums, and cultural heritage institutions that have created sharable metadata using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) may register with the Pennsylvania Digital Library (PADL). Currently, PADL is not harvesting metadata from institutions outside the state of Pennsylvania, regardless of content. PADL could serve as a model for other states in the region.
What do I need to register with PADL? Registrants with digital collections need to create sharable metadata based on the OAI-PMH. It is recommended that participants validate and register as a Data Provider with the Open Archives Initiative before registering with PADL. Registering with the Open Archives Initiative will allow registrants to see how their data is exposed through the OAI-PMH before submitting records to PADL. Registrants will need to provide a collection title, collection URL, OAI base URL, administrator's email address, and index method.
What is the difference between the Index Methods? You may choose whether your OAI repository is harvested using the ListRecords or the ListIdentifiers method. ListRecords will generally be faster, but ListIdentifiers may be useful in some cases with repositories that are not 100% compatible.
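The difference between the two methods can be illustrated by the requests a harvester would send. The following is a minimal sketch, not PADL's actual implementation: the base URL and record identifier are hypothetical, and the helper name build_request is an assumption made for illustration. In the OAI-PMH, ListRecords returns complete records in one pass, while ListIdentifiers returns only record headers, so a harvester using it typically follows up with one GetRecord request per identifier.

```python
from urllib.parse import urlencode


def build_request(base_url: str, verb: str, **args: str) -> str:
    """Return an OAI-PMH request URL for the given verb and arguments."""
    query = urlencode({"verb": verb, **args})
    return f"{base_url}?{query}"


# Hypothetical OAI base URL, used only for illustration.
BASE_URL = "https://repository.example.edu/oai"

# ListRecords: one request (plus any continuation pages) returns full records.
list_records_url = build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

# ListIdentifiers: returns headers only, so each record is fetched separately.
list_identifiers_url = build_request(
    BASE_URL, "ListIdentifiers", metadataPrefix="oai_dc"
)
get_record_url = build_request(
    BASE_URL,
    "GetRecord",
    identifier="oai:repository.example.edu:123",  # hypothetical identifier
    metadataPrefix="oai_dc",
)

print(list_records_url)
print(list_identifiers_url)
print(get_record_url)
```

The extra round trip per record is why ListIdentifiers tends to be slower, while fetching records one at a time can be more forgiving when a repository has trouble serving large ListRecords responses.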
What happens after I register? After staff at the University of Pittsburgh determine that you are a valid participant (see "Who can participate?"), your collection will be harvested and become part of the PADL. Within a week (i.e. the next scheduled harvest), you should see your collection in PADL. The next step is to verify that your data is displayed appropriately. If your data is not displaying properly, you should contact PADL and your digital archive or repository vendor.
How many sites can my institution register? There is no limit to the number of sites an institution can register, provided that each site has a unique OAI base URL.
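The registration fields and the uniqueness rule can be pictured as a small data structure. This is only a sketch under stated assumptions: the class name Registration, the function register, and the example values are all hypothetical, and comparing base URLs as trimmed strings is an assumption rather than a description of how the PADL registration tool works.

```python
from dataclasses import dataclass

# The two index methods named in this FAQ.
INDEX_METHODS = {"ListRecords", "ListIdentifiers"}


@dataclass(frozen=True)
class Registration:
    """The five fields a registrant is asked to provide (names are assumed)."""
    collection_title: str
    collection_url: str
    oai_base_url: str
    admin_email: str
    index_method: str


def register(registry: dict[str, Registration], entry: Registration) -> None:
    """Add an entry, rejecting unknown index methods and duplicate base URLs."""
    if entry.index_method not in INDEX_METHODS:
        raise ValueError(f"unknown index method: {entry.index_method}")
    key = entry.oai_base_url.strip()  # assumption: plain string comparison
    if key in registry:
        raise ValueError(f"OAI base URL already registered: {key}")
    registry[key] = entry


registry: dict[str, Registration] = {}
register(registry, Registration(
    "Historic Photographs", "https://library.example.edu/photos",
    "https://library.example.edu/photos/oai", "admin@example.edu", "ListRecords",
))
register(registry, Registration(
    "Theses and Dissertations", "https://library.example.edu/etd",
    "https://library.example.edu/etd/oai", "admin@example.edu", "ListIdentifiers",
))
print(len(registry))
```

In this sketch one institution registers two sites, which is accepted because their OAI base URLs differ; a third entry reusing either base URL would be rejected.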
Harvesting
What is harvesting? In the OAI context, harvesting refers specifically to the gathering together of metadata from a number of distributed repositories into a combined data store. PADL harvests metadata from registered institutions to create an aggregated repository.
How do I get my archive harvested in PADL? Pennsylvania libraries, museums, and cultural heritage institutions that have created harvestable metadata for their digital collections should register with the PADL. Metadata should conform to the OAI-PMH specifications. It is recommended that institutions that wish to participate in the PADL register as a data provider with the Open Archives Initiative. The OAI provides data providers with tools to test the conformance of their data.
What does PADL harvest? Only the metadata will be harvested. The digital objects will continue to be hosted at the member institution's site. Digital objects will be subject to any access restrictions set by the member institution. The PADL harvests metadata for non-commercial collections of electronic documents or other digital collections. This material may be digitized content (e.g., historical photographs, manuscripts, etc.) or born-digital content (e.g., electronic theses and dissertations, e-prints, etc.). The registry will not include licensed subscription-based content, such as commercial databases and full-text services.
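To make "metadata only" concrete, the sketch below parses a small, invented OAI-PMH ListRecords response in unqualified Dublin Core (oai_dc) and extracts the record title and a link back to the object at the host site. The sample record, the function name extract_records, and the assumption that dc:identifier holds the object's URL are illustrative; repositories vary in what they place in that field.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response; a real one would come from a data provider.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.edu:123</identifier>
        <datestamp>2024-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Street scene (sample record)</dc:title>
          <dc:identifier>https://repository.example.edu/items/123</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_records(xml_text: str) -> list[dict]:
    """Return the OAI identifier, title, and object link for each record."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.findall("./oai:ListRecords/oai:record", NS):
        results.append({
            "oai_id": record.findtext("./oai:header/oai:identifier", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
            "link": record.findtext(".//dc:identifier", namespaces=NS),
        })
    return results


records = extract_records(SAMPLE)
print(records)
```

Nothing in the response contains the photograph or document itself; the harvester stores the description and the link, and the object stays on the member institution's server.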
How often does PADL harvest metadata from data providers? Metadata from registered sites will be harvested on a weekly basis. During the weekly harvest, we will update metadata for new and updated records only.
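One way a harvester can limit a weekly run to new and updated records is the OAI-PMH selective harvesting argument "from", which asks a repository for records whose datestamp is on or after a given UTC date. The sketch below assumes this mechanism for illustration; the base URL, the dates, and the function name incremental_request are hypothetical, and the FAQ does not state exactly how PADL schedules or parameterizes its requests.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def incremental_request(base_url: str, last_harvest: date,
                        metadata_prefix: str = "oai_dc") -> str:
    """Build a ListRecords URL limited to records changed since last_harvest."""
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": last_harvest.isoformat(),  # day granularity: YYYY-MM-DD
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical values for illustration.
last_harvest = date(2024, 1, 1)
url = incremental_request("https://repository.example.edu/oai", last_harvest)
next_harvest = last_harvest + timedelta(days=7)  # weekly schedule

print(url)
print(next_harvest.isoformat())
```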
How often is PADL refreshed? The metadata in PADL will be refreshed on a quarterly basis. During a refresh, all metadata previously harvested from your site will be deleted and replaced with a snapshot of your current metadata. Any changes that you make to the way you expose your metadata using the OAI-PMH will not be reflected for all of your records until the next quarterly refresh.
What if my data does not display the way I expected? The metadata display is based on the data that you have created. For example, if your data appears to be mislabeled, you may have some problems with metadata mapping. The first step would be to contact your digital archive or repository vendor. You can also contact PADL using the "Contact Us" link on the Web site. We may be able to help you troubleshoot some common problems, although we are not experts on all systems.
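The difference between a weekly harvest and a quarterly refresh can be sketched with an in-memory store. This is a simplified model, not PADL's software: the function names, the record identifiers, and the "label" field standing in for a metadata mapping are all invented for illustration.

```python
def weekly_harvest(store: dict[str, dict], changed: dict[str, dict]) -> None:
    """Merge new and updated records; records not in `changed` stay as they are."""
    store.update(changed)


def quarterly_refresh(store: dict[str, dict], snapshot: dict[str, dict]) -> None:
    """Delete everything harvested from the site and load the current snapshot."""
    store.clear()
    store.update(snapshot)


# Two records harvested earlier, both exposed with an old mapping ("Author").
store = {
    "oai:example:1": {"title": "Old map", "label": "Author"},
    "oai:example:2": {"title": "Letter", "label": "Author"},
}

# The provider changes its mapping to "Creator", but only record 2 has been
# modified since the last harvest, so only record 2 arrives in the weekly run.
weekly_harvest(store, {"oai:example:2": {"title": "Letter", "label": "Creator"}})
after_weekly = {key: rec["label"] for key, rec in store.items()}

# The quarterly refresh replaces every record with the current snapshot.
quarterly_refresh(store, {
    "oai:example:1": {"title": "Old map", "label": "Creator"},
    "oai:example:2": {"title": "Letter", "label": "Creator"},
})
after_refresh = {key: rec["label"] for key, rec in store.items()}

print(after_weekly)
print(after_refresh)
```

After the weekly run the store holds a mix of old and new mappings; only the full refresh brings every record in line, which is why a mapping change may not appear everywhere until the next quarter.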
What harvesting software does PADL use? PADL uses the PKP Open Archives Harvester. It is a free metadata indexing system developed by the Public Knowledge Project through its federally funded efforts to expand and improve access to research. More information on the harvester software is available at the PKP site: http://pkp.sfu.ca/?q=harvester.
Access
How do users access PADL? Member institutions should provide their users with a link to the search interface on the ULS site. Users of the Pennsylvania Digital Library (PADL) will be able to search the metadata of all statewide digital collections that have been registered with and harvested by the PADL. Users will then be directed, via a link, to the digital object (image, text, ETD, etc.) at the host institution's site. The search tool will increase exposure to Pennsylvania's digital resources and make them more discoverable on the open Web.
Who will be able to access PADL? The PADL search engine is accessible to the world, free of charge, via the World Wide Web. There are no restrictions on access.
What about documents with restricted access? Users will navigate from PADL to the local repository or archive to retrieve material. Access is controlled by the owning repository and all local restrictions will apply.
Are there any copyright issues related to PADL? None, other than the restrictions that each local institution sets on access to its own collections. Please consult the owning repository for information on copyright restrictions for individual materials.
Will you store copies of our documents on PADL? PADL is a metadata repository only. We provide links to documents stored in member institutions' archives, but we do not harvest and store the digital documents themselves.
Open Archives Initiative Protocol for Metadata Harvesting
What is OAI-PMH? The OAI-PMH is a low-barrier mechanism for repository interoperability. OAI-PMH has been widely adopted as an open standards approach to harvesting metadata for digital resources. The OAI-PMH is a set of rules governing communication between an archive or repository sharing metadata (Data Provider) and a harvester (Service Provider). The Data Provider maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata. A Service Provider issues OAI-PMH requests to Data Providers and uses the metadata as a basis for building value-added services. A Service Provider in this manner is "harvesting" the metadata exposed by Data Providers. A variety of tools and tutorials on the OAI-PMH are available online.
How do I know if my repository is OAI-PMH compliant?
How can I make my repository OAI-PMH compliant?
Can anyone help me make my repository OAI-PMH compliant? It is recommended that you contact your digital repository vendor with questions related to creating an OAI-compliant repository. The Open Archives Initiative has an OAI validation tool. It is recommended that you validate and register your collection with the Open Archives Initiative before attempting to register with PADL. We've listed a few Web sites and tutorials below that may be of assistance.
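Before running a full validation, a quick first check is to send the OAI-PMH Identify request to your base URL and confirm that the response is well-formed and reports a protocol version. The sketch below is a rough sanity check only and does not replace the OAI validation tool; the base URL and sample response are invented, and the function names identify_url and summarize_identify are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(base_url: str) -> str:
    """Build the Identify request that every OAI-PMH repository must answer."""
    return f"{base_url}?{urlencode({'verb': 'Identify'})}"


def summarize_identify(xml_text: str) -> dict:
    """Pull a few required elements out of an Identify response."""
    root = ET.fromstring(xml_text)
    error = root.find("oai:error", OAI_NS)
    if error is not None:
        raise ValueError(f"repository returned an error: {error.get('code')}")
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("response does not contain an Identify element")
    fields = ("repositoryName", "baseURL", "protocolVersion")
    return {name: identify.findtext(f"oai:{name}", namespaces=OAI_NS)
            for name in fields}


# Invented sample response; in practice, fetch identify_url(...) over HTTP.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Digital Collections</repositoryName>
    <baseURL>https://repository.example.edu/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>"""

print(identify_url("https://repository.example.edu/oai"))
summary = summarize_identify(SAMPLE)
print(summary)
```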
Where can I find more information on Open Archives and OAI-PMH?
Some recommended OAI resources:
- Open Archives Initiative Homepage: http://www.openarchives.org/
- The Open Archives Initiative Protocol for Metadata Harvesting: http://www.openarchives.org/OAI/openarchivesprotocol.html
- Web Tutorial on OAI-PMH: http://www.oaforum.org/tutorial/