CDI Conference Call - February 12, 2014
The Community for Data Integration (CDI) meetings are held the 2nd Wednesday of each month at 11:00 a.m. Eastern Time.
USGS/DOI Dial In Number: (703) 648-4848 (for USGS and DOI offices)
Toll Free Dial In Number: (855) 547-8255 (for other offices and telecommute locations)
Conference Code: 47919# (same for both numbers)
Open Data Initiatives and USGS Response
*This meeting represents the previously postponed Earth System Informatics Session of the 2013 CDI Webinar Series.
Background: On May 9, 2013, the White House released the Executive Order, “Making Open and Machine Readable the New Default for Government Information.” This Order built upon an earlier interagency memorandum from the Office of Science and Technology Policy (OSTP) and was accompanied by an Office of Management and Budget (OMB) Policy and a site hosted on GitHub, Project Open Data, to guide implementation. To meet the requirements outlined in these initiatives, USGS will undertake a variety of tasks, in concert with the Department of the Interior (DOI), data.gov and others. Through these presentations, CDI participants will learn about the broader policy, as well as the DOI and USGS responses and how to become involved.
Click link to access the Webex recording
Agenda (in Eastern time)
11:00a Opening Remarks [PDF] -- Open Data and CDI – Kevin T. Gallagher, Associate Director for Core Science Systems, USGS
11:15a Digital Government Strategy [PDF] - Rich Frazier, Deputy Associate Director or Nancy Sternberg, Chief, Information and Investment Management, Administration and Enterprise Information
11:35a USGS Policies and Activities Related to Open Government [PDF] -- Carolyn Reid, Policy Analyst, Office of Science Quality and Integrity
11:55a USGS Web Reengineering [PDF] -- Karen Armstrong or Tim Woods, Office of Communications and Publishing
12:15p USGS Science Data Catalog [PDF] – Mike Frame, USGS Core Science Analytics and Synthesis
12:35p Open Data Q & A, Discussion
Opening Remarks - Kevin G.
- Ray Obuch – Is there going to be a push to build out in BASIS the ability to identify the cost of data management?
- Kevin – I have not yet heard of a proposal to ensure that these project costs are accounted for in BASIS. It makes sense to do so, and maybe it should be discussed. A lot of what we want to achieve in DM represents a resource. We are doing a lot already, but we should make sure we are coordinating our current efforts and being as efficient as we can be.
- Keith Kirk – In talking with scientists, I’ve noticed that we are not doing an adequate job of getting the word out about these high-level initiatives. I would suggest we start thinking about a better communication strategy to get information out to scientists.
- Kevin – Agree that communication should never stop. We have communications office representatives. I think we need them from the top down, peer to peer, and from the bottom up. We do have a Science Data Coordinator Network to help us communicate to the center levels.
Digital Government Strategy - Nancy S.
- Keith K. – Are you coordinating with the science publishing network and helping them change their processes to meet the requirements of the open data initiatives?
- Nancy - Yes we are involved. Please see Carolyn’s presentation.
USGS Web Reengineering - Tim Woods
- Fran Lightsom – Could you say a little more about what refining data labels means? Who is working on that?
- Tim – I believe it was a CDI group working on the various definitions and glossary of what data in USGS is. We want to get hold of the information and use the appropriate labels that will be discussed in the Science Data Catalog.
- Sofia Lui - One of the challenges in developing a crisis crowd sourcing application is to have a public facing prototype and to be able to do usability testing. Thoughts on that issue?
- Tim – It is tough to do development work and get it to people who are not behind a firewall. We have recycled a sub-domain website so we can put prototypes outside of USGS. If you contact the natweb staff, there might be sub-domains available so you can do testing.
- Ray O. – There are going to be more requirements for increased hosting services for datasets. Programs may want to move datasets out into a central site. Would your group lead the charge in providing hosting services?
- Tim – No, but I advocate that people go to the Data Management website, where there are a host of catalog tools for capturing high-level metadata. As for what people should do about moving to centralized systems, the question is what the USGS plan is with the DOI cloud contract. They should talk to AEI or go to the DOI website.
- Sophia - What is the progress of the cloud contract?
- Tim - It has been awarded. The first step is for people to go to cloud.doi.gov. There is a series of questionnaires and services available.
- Nancy – We are trying to get organized around the cloud. If you are interested, there are two procurement individuals at the Department focusing on acquisition. In USGS there are 8 activities under way that are trying to leverage one of these cloud contracts available through DOI.
- Ben – Maybe as that process firms up, this would be a good CDI focused meeting.
USGS Science Data Catalog - Mike Frame
- Nancy Richie – Will you share your draft document? We are trying to integrate this across NOAA and this would be helpful.
- Mike – The intent of that document was to help drive what should be included in this Catalog. There is a piece focusing on metadata and completeness. I am happy to share that.
- Annie Simpson – Is there any way I can improve the metadata in the Science Data Catalog?
- Mike – We created a scorecard to help providers understand how to add information to their metadata. We are going through records, especially those for featured datasets, and intend to work with providers to make sure the metadata will be as robust as possible.
- Cassandra Ladino – Does the Science Data Catalog automatically pull data from ScienceBase? What’s the relationship between the two?
- Mike – No data in the catalog are pulled from ScienceBase automatically unless the provider wants to make their metadata available to the catalog through ScienceBase. We are trying to be very considerate and cautious of not just harvesting across the USGS web. We want a conscious effort by our providers to direct what metadata they want to make available through the catalog by using the Dashboard.
- Ray – Will this catalog accommodate non-scientific (administrative) data to answer the OMB M-13-13 requirement? Some of these data might not be public.
- Mike - We have to figure that out, actually. Administrative data is a different beast than science metadata. We need to try to make it as integrated as possible, but the two kinds of data are very different.
- Cheryl – The focus right now is science data. We need to concentrate on making our science data available through this catalog.
- Keith K- I have a lot of folks who would like to use the data catalog. Which link do I use?
- Mike – We have made no announcements and no press releases yet. One of the major pieces we have to work on is delivering metadata to the Department of the Interior: we are required to register USGS metadata with DOI, which then delivers it on to OMB and data.gov. Workflow. Try to leave it out to the feedback, but there have been no linkages until we work through the group.
- Liz Sellers – It may be worth mentioning the difference between data.usgs.gov/datacatalog and data.usgs.gov/catalog.
- Mike – Several years ago, before the open data initiatives, we established a domain for data.usgs.gov. We have a placeholder and we have been developing the ScienceBase and Clearinghouse infrastructure. The Science Data Catalog is a combination of those two pieces. We are going to combine these two links into one soon.
- Linda – How do we learn the process for getting metadata into the Science Data Catalog?
- Mike - There is a best practice guide that is being developed. Part of this is the outreach and communications. We will set up Webex sessions to help walk people through the registration process.
- Helen Tong - What’s the relationship between data.gov and data.usgs.gov?
- Mike – data.usgs.gov holds our metadata in USGS. In our case, DOI has to establish data domains for each of its agencies. The concept is that USGS registers its metadata with DOI. Other agencies under the Department do the same. DOI will then serve up the metadata to data.gov, which is an aggregator of data across all agencies. This is the idea of how it will work and avoid duplication of metadata.
- Heather – For those of us who already have links on our websites that point to our data holdings, will this WRET effort modify navigation items? If we are going to a standard template for the new USGS and there is a link for data but we want to be able to link to our own data and websites, how is that going to be handled?
- Tim - The way that more websites will move into the new USGS web platform is still under discussion. We are going through an extensive information architecture process.
- Karen Armstrong – We are going to be going out and talking with Science Centers to help figure out how they fit into this information architecture and to find out what the best process for that is.
- Cheryl – As a last remark, we really need the CDI engagement and input to help us make sure we are going in the right direction for all of these activities.