An understanding of metadata standards and profiles is critical to a successful distributed cataloging strategy that will result in useful client applications of the technology. CSW servers can return any number of metadata standards in a search or harvest operation, and applications written to use the CSW service methods need to "know" what they are getting in a few key areas. Some policies may need to be put in place across instances of the technology stack so that there are general rules of the road that everyone understands.
For instance, we may want to specify that every metadata record returned by CSW is identified with a universally unique identifier (UUID) so that search and harvest operations can tease out duplicates. In this same vein, we may want metadata from CSW servers to support some search criteria or server configuration that will serve only the unique records that call that particular instance "home" as opposed to other records that may be harvested onto that server. If vocabularies of terms are important to a particular collection of metadata and applications that use those metadata, we may want to specify that controlled vocabulary sources be identified to a particular registry in a particular way so that the authorities can be understood by any application.
All of these considerations may provide a useful focus for the Metadata Team.
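As a rough illustration of the UUID and "home" record policies described above, the following Python sketch removes duplicate records by identifier and then keeps only the records that belong to one server. The record layout (plain dictionaries with `identifier` and `source` keys), the host names, and both helper functions are assumptions made for this example; they are not part of CSW or of any particular catalog product.

```python
from typing import Iterable


def dedupe_by_uuid(records: Iterable[dict]) -> list[dict]:
    """Keep the first record seen for each identifier.

    Identifiers are compared case-insensitively, since the same UUID
    may be written in upper or lower case by different servers.
    """
    seen = set()
    unique = []
    for record in records:
        key = record["identifier"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def home_records(records: Iterable[dict], home: str) -> list[dict]:
    """Return only the records whose (hypothetical) source field names this server."""
    return [r for r in records if r.get("source") == home]


# Hypothetical harvest result: the first two entries are the same record
# held by two different servers.
harvested = [
    {"identifier": "6F9619FF-8B86-D011-B42D-00C04FC964FF", "source": "csw-a.example.org"},
    {"identifier": "6f9619ff-8b86-d011-b42d-00c04fc964ff", "source": "csw-b.example.org"},
    {"identifier": "123e4567-e89b-12d3-a456-426614174000", "source": "csw-b.example.org"},
]

unique = dedupe_by_uuid(harvested)
local = home_records(unique, "csw-b.example.org")
```

Keeping the first copy seen is only one possible rule; a real policy might instead prefer the copy held by the record's home server.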
1 Comment
Ierardi, Michael C.
I agree that the understanding of metadata, metadata standards, profiles and schemas is critical to a successful distributed cataloging strategy, but this understanding also needs to be applied to the existing metadata records being stored in a catalog.
Existing client applications of the technology (search engines) within catalog services (portals) for the web (CSW) identify records by classification, theme, and content type. This is the mechanism that should be used to search for and identify records of interest for further processing.
Although every metadata record can be identified by a dynamically generated universally unique identifier (UUID), the identifier is nearly indecipherable and only as reliable as the original CSW that created it, or the source where the data originated or is being harvested from. When the host CSW fails, the UUID will have to be regenerated, which breaks the hierarchy of resources within those metadata files. This break affects the entire harvesting chain, so reharvesting will need to occur up the chain, from the host to your CSW. Working with UUID values is very cumbersome: the values are inconsistent and follow no alphanumeric pattern. A simpler choice for editing would be the resource classification indicated in the metadata.
Numerous resource classifications (you can even make up your own) exist among the vocabularies (ontologies) of terms used across catalogs. My concern is why some metadata has been released without a classification already written into it.
Standard resource classifications were designated and identified by the FGDC registry years ago, helping authorities easily classify their data and metadata by the most appropriate category, resource, or mission. They are listed below, first by content type and then by ISO topic category:
Applications
Clearinghouses
Documents
Downloadable Data
Geographic Activities
Geographic Services
Live Map Services
Map Files
Offline Data
Static Map Images
ISO Topic Category
Administrative and Political Boundaries
Agriculture and Farming
Atmosphere and Climatic
Biology and Ecology
Business and Economic
Cadastral
Cultural, Society and Demography
Elevation and Derived Products
Environment and Conservation
Facilities and Structures
Geological and Geophysical
Human Health and Disease
Imagery and Base Maps
Inland Water Resources
Locations and Geodetic Networks
Military
Oceans and Estuaries
Transportation Networks
Utilities and Communication
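To show how records released without a classification might be found before a catalog is built, here is a minimal Python sketch that checks each record against the two lists above. The record layout (dictionaries with `title`, `content_type`, and `topic_category` keys), the sample records, and the function name are assumptions for illustration only; real metadata would first have to be parsed from its own schema.

```python
# Classification values copied from the lists above.
CONTENT_TYPES = {
    "Applications", "Clearinghouses", "Documents", "Downloadable Data",
    "Geographic Activities", "Geographic Services", "Live Map Services",
    "Map Files", "Offline Data", "Static Map Images",
}

TOPIC_CATEGORIES = {
    "Administrative and Political Boundaries", "Agriculture and Farming",
    "Atmosphere and Climatic", "Biology and Ecology", "Business and Economic",
    "Cadastral", "Cultural, Society and Demography",
    "Elevation and Derived Products", "Environment and Conservation",
    "Facilities and Structures", "Geological and Geophysical",
    "Human Health and Disease", "Imagery and Base Maps",
    "Inland Water Resources", "Locations and Geodetic Networks", "Military",
    "Oceans and Estuaries", "Transportation Networks",
    "Utilities and Communication",
}


def classification_problems(record: dict) -> list[str]:
    """Name the fields whose value is missing or not in the lists above."""
    problems = []
    if record.get("content_type") not in CONTENT_TYPES:
        problems.append("content_type")
    if record.get("topic_category") not in TOPIC_CATEGORIES:
        problems.append("topic_category")
    return problems


# Hypothetical records: the second has no classification at all,
# the third uses a made-up topic category.
records = [
    {"title": "County parcels", "content_type": "Downloadable Data",
     "topic_category": "Cadastral"},
    {"title": "Unlabeled roads layer"},
    {"title": "Stream gauges", "content_type": "Live Map Services",
     "topic_category": "Rivers"},
]

# Map each problem record's title to the fields that need correcting.
flagged = {}
for record in records:
    problems = classification_problems(record)
    if problems:
        flagged[record["title"]] = problems
```

A report like `flagged` could be handed to whoever maintains each area's records so the gaps are corrected at the source.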
The fix is to establish metadata "stewards", "metadata masters", or "nodemasters" among the Mission Areas, so that each area's records and metadata can be corrected before attempting to create a catalog. … Once a catalog exists for each area, migrating to a catalog of catalogs would be the next step. I see a link indicating a "metadata team" ... where does this team exist?