The CDI working groups have been very active and I haven’t been able to attend every meeting. I’ll still try to keep track of their discussion topics here even if I couldn’t call in.
On February 8th, Steve Tessler presented “Using Microsoft Access for Data Processing: Managing ‘ELT’ with a Staging Area Approach.” I wasn’t able to attend this meeting, but the slides and meeting are posted in the DMWG space.
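The slides explain general concepts that apply regardless of the tool, not just to Microsoft Access. Here, “ELT” stands for Extract, Load, Transform: raw data is loaded into a staging area first and transformed there. It is a variant of the more familiar Extract, Transform, Load (ETL), one of the most common sets of activities in data processing.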
The take-home message for repeatability: save a process, not intermediate data.
From the presentation, what you need to do when you process data:
Learn a scripting language; knowing SQL is also very useful
Create a sequence of process actions to do your work, including clean-up
Document each step of your data-handling as Process Metadata
Share the process, not the files
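To make the list above a bit more concrete for myself, here is a minimal sketch of a scripted staging-area process. It is not from the slides: it uses Python's built-in sqlite3 module instead of Microsoft Access, and the table names, sample rows, and the run_pipeline function are all hypothetical. Each step is recorded in a log, which stands in for the process metadata.

```python
import sqlite3

# Hypothetical raw readings; in practice these would be extracted from a source file.
RAW_ROWS = [
    ("site-01", "2016-02-01", " 12.5 "),
    ("site-01", "2016-02-02", "n/a"),
    ("site-02", "2016-02-01", "7.25"),
]


def run_pipeline(conn, raw_rows):
    """Load raw rows into staging, transform them with SQL, clean up, and return the step log."""
    log = []  # process metadata: one description per completed step

    def step(description, sql, rows=None):
        # Run one SQL action and record what was done.
        if rows is None:
            conn.execute(sql)
        else:
            conn.executemany(sql, rows)
        log.append(description)

    step("create staging table",
         "CREATE TABLE staging (site TEXT, day TEXT, value TEXT)")
    step("load raw rows unchanged into staging",
         "INSERT INTO staging VALUES (?, ?, ?)", raw_rows)
    step("create final table",
         "CREATE TABLE readings (site TEXT, day TEXT, value REAL)")
    step("transform: trim values and keep those starting with a digit",
         "INSERT INTO readings "
         "SELECT site, day, CAST(TRIM(value) AS REAL) FROM staging "
         "WHERE TRIM(value) GLOB '[0-9]*'")
    step("clean up: drop staging table", "DROP TABLE staging")
    conn.commit()
    return log
```

Calling run_pipeline(sqlite3.connect(":memory:"), RAW_ROWS) rebuilds the readings table from the raw rows every time, so the script and its log are the things worth saving and sharing, not the staging table.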
Also - USGS Public Access Plan for OSTP is posted! http://www.usgs.gov/quality_integrity/open_access/default.asp
The Semantic Web Working Group met on February 11 and discussed “Using a Vocabulary Service for Secondary Metadata Validation.” Peter Schweitzer has modified an online metadata validator to use vocabulary services to check the accuracy of keywords in metadata fields.
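As a rough illustration of the idea (not Peter's validator, and not the real service interface), the keyword check can be sketched as a comparison against a controlled vocabulary. In this sketch a small local set stands in for the terms a vocabulary service would return; the sample terms and the function names are hypothetical.

```python
# Hypothetical stand-in for the terms a vocabulary service would return.
# These sample terms are illustrative only.
SAMPLE_VOCABULARY = {"groundwater", "earthquakes", "land use change"}


def normalize(term):
    """Lowercase a term and collapse runs of whitespace to single spaces."""
    return " ".join(term.lower().split())


def validate_keywords(keywords, vocabulary):
    """Split keywords into (recognized, unrecognized) lists, preserving input order."""
    known = {normalize(term) for term in vocabulary}
    recognized, unrecognized = [], []
    for keyword in keywords:
        if normalize(keyword) in known:
            recognized.append(keyword)
        else:
            unrecognized.append(keyword)
    return recognized, unrecognized
```

With the sample set, a misspelled keyword such as "earthquaks" ends up in the unrecognized list, which is the kind of error a secondary validation pass is meant to flag.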
Whenever I call into the SWWG, I always mention that I’m just there to learn more, because I’m really a novice when it comes to the Semantic Web.
Some links noted down by this novice:
USGS Thesaurus (Science Topics): a browsable index of web resources specifically intended to help people outside USGS find information on USGS web sites without specific knowledge of the organizational structure and operations of the USGS
USGS Thesaurus Web Services: services that enable web pages to obtain detailed information about thesaurus terms for use in dynamic web interfaces
Terms from multiple thesauri: an example of a web page using the USGS Thesaurus Web Services.