FY 2011 Annual Report (MSWord) Feb 3, 2012

2011 Funding Deliverables [docx]

FY 2011 Project Proposal Suite

See the "Attach" tab, above, for the draft high-level CDI FY 2011 Science Plan synthesizing the ideas, below, as well as a concept map summarizing the current proposal suite's foci and relationships. The concept map discusses focus areas that may be generally accomplished within the scope of the "Community of Practice Facilitation" discussion below. A struggle with the proposal process has been the tension between the free-floating activities of working groups with rather ill-defined end goals and the need to plan funding with inherent goals and outcomes. The Guiding Principles and other new sections below are one attempt to balance this tension.

Guiding Principles

After several discussions amongst community members about scoping work in FY2011, a number of key guiding principles are emerging:

Community of Practice Facilitation

Earlier iterations of the 2011 proposal focused much more heavily on the production of specific tools and components for data integration. Discussions amongst various teams led the group much more in the direction of targeted facilitation of the Community of Practice Working Groups. This was described by one community member (paraphrased here) as giving teams with generally parallel goals and objectives a reason to get together, discover the interdependencies between their projects, and come up with ways to make a stronger and more broadly applicable end product.

Examples of this include:

These concepts need to be more thoroughly fleshed out into a set of discrete actions and funding amounts for the forthcoming proposal.

Documentation and Outreach

Peter Fox (RPI), who addressed the Community at the August workshop, said he appreciated the open attitude of our USGS community but noted that he, as an outsider, was not able to collaborate in our online space. We took this to heart and would like to propose the following:

Publishing Products

The 2010 work produced several good products that are candidates for publishing into the open source marketplace. Publishing as such will a) provide an official outlet for the software and documentation and b) open the products up for collaboration by the broader community. Several community members participated in an exercise to rewrite a USGS policy that will support this type of software release, but there are still a number of challenges in putting open source publishing into practice (e.g., publishing processes, Center and Bureau approval, Fundamental Science Practices, etc.). It is proposed that some level of funding and energy be put toward releasing one or more products in 2011 through an appropriate open source venue, working through all of the necessary practices in the USGS. These practices would be codified and released through the CDI Web space (discussed above).

The "toolkit" items below were one attempt to discuss FY2011 work that focused more on producing tools and widgets than on facilitating community interactions. These concepts and associated goals are still valid and will be part of the overall work, but the focus for the proposal has shifted toward community of practice facilitation and accomplishment of these goals within that framework.

Data Integration Toolkit (DI-Toolkit)

This element will build on the FY2010 work on the Geo-Data Portal (GDP), the ArcGIS tools for data access, and other efforts that directly apply Web services access to data. The aim is an overall toolkit that can be applied to various science questions in need of integrated data.

Possible Leadership Team Identified at the Workshop: Carma San Juan, Roland Viger, Bruce Jones, Mike McHale, Robin O'Malley

Major Elements

Continuation of FY2010 Work

Original Workshop Bullets (working points)

Data Services Toolkit (DS-Toolkit) a.k.a. Tech Stack Working Group

The wiki page listed above should be consulted for the most recent developments of this group. The following text reflects thoughts at the time of the workshop, which continue to evolve. In general, this element builds on the FY2010 work for the Data Uploader and the concepts discussed heavily in the workshop about a group of data hosting and serving technologies operating in different parts of the USGS. The hope is that all these developments can be coordinated to form a "Scientific Data Network." At the very least, this element seeks to develop a community of practice around the technology and design of such a network.

Scientists may opt to use a centralized incarnation of the Data Uploader or download and deploy a local version of this platform (this version is sometimes referred to as a "droppable appliance") within their project. In addition, the design of this platform and the community discussions around it will help to provide a template and guidance for projects that need to develop their own information management system ("appliance") and will hopefully result in a project-specialized system that can still participate in the greater Scientific Data Network.

An emphasis that has grown since the workshop is worth mentioning here: this community seeks to develop understanding and recommendations on how such appliances can be engineered to effectively exploit the many corporate/enterprise data services, such as those being pushed forward by the National Map and the National Water Information System. While the Data Uploader has consciously attempted to build around open-source software (in addition to open standards), many of these corporate services leverage important proprietary components of the Agency's corporate computer model, such as ESRI ArcGIS Server.

Possible Leadership Team Identified at the Workshop - Blodgett, Viger, Dadisman, Kern, Gunther, Skinner, Hope/Tricomi, Greenlee

Major Elements

Continuation of FY2010 Work

Original Workshop Bullets

  1. Develop a Technology Stack Working Group
    1. Unified Access Framework for Gridded and Vector Data:  WMS, WCS, WFS, and OPeNDAP+CF - Rich Signell, Blodgett, Viger
    2. Discoverability and Interoperability of Web-based Data and Processing Services - Blodgett, Viger, Dadisman, Kern, Gunther, Skinner
    3. Catalog Services Plan, including architecture as well as search and harvesting functions, with metadata standards - Steve Richard
    4. Web Application Integration Framework (Summary Notes Presented) - Matt Tricomi, Greg Smoczyk, Hollister
    5. (Moved from DI Toolkit to Tech Stack) Create USGS Corporate Services Integrated Roadmap for serving TNM, GloVIS, MRDATA, NWIS (streamflow and water quality), and watershed boundaries at national scope with statistical capability (TNM, WBD, NWIS, MRDATA, GloVIS)
    6. Data/Application Publishing - Web Services Publishing Best Practices (TNM, WSWG) (Moved from DM to Tech Stack because it concerns publishing tech services and to allow NGP to participate in one group - discussed at 12/7 TS meeting)
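
As a rough illustration of where a unified access layer over the OGC services named in the Unified Access Framework bullet might start, the sketch below builds GetCapabilities request URLs for WMS, WCS, and WFS. The endpoint `https://example.gov/services/ows` and the function names are hypothetical and do not refer to an existing USGS service or API. OPeNDAP is left out because it does not use the OGC key-value request convention.

```python
from urllib.parse import urlencode

# Hypothetical endpoint used only for illustration; not a real USGS service.
BASE_URL = "https://example.gov/services/ows"

# OGC services named in the workshop bullet that share the
# key-value-pair GetCapabilities convention.
OGC_SERVICES = ("WMS", "WCS", "WFS")


def capabilities_url(base_url: str, service: str) -> str:
    """Build a GetCapabilities request URL for one OGC service."""
    service = service.upper()
    if service not in OGC_SERVICES:
        raise ValueError(f"unsupported OGC service: {service}")
    query = urlencode({"service": service, "request": "GetCapabilities"})
    return f"{base_url}?{query}"


def all_capabilities_urls(base_url: str) -> dict:
    """Map each supported service name to its GetCapabilities URL."""
    return {s: capabilities_url(base_url, s) for s in OGC_SERVICES}


if __name__ == "__main__":
    for service, url in all_capabilities_urls(BASE_URL).items():
        print(service, url)
```

A client could fetch each URL and read the returned capabilities document to discover the available layers, coverages, or feature types, which is one small piece of the discoverability goal listed above.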

Data Management Toolkit (DM-Toolkit)

The success of fully applying the DI-Toolkit for scientific applications and using the DS-Toolkit for hosting and serving usable data will depend heavily on the organization's ability to effectively manage its data and associated documentation (metadata). (Reference the Greenlee User Story.) Building on the "PI-Toolkit" concepts discussed in relation to documenting data uploaded through the Uploader during the 2010 work, a much larger team met to discuss the overall dynamics of managing, documenting, and publishing data. This group examined social and organizational dynamics inherent in the data management process and identified a number of areas for forward momentum in 2011.

Possible Leadership Team Identified at the Workshop - Heather Henkel, Viv H., Adrian, Huffine, Dadisman, Frame/Mancuso, Kase/Fornwall, Schweitzer, Govoni

Major Elements

Original Workshop Bullets

Existing Data Management Tools and Products as Practical Examples

Knowledge Management Framework (KMF) and Toolkit (KM-Toolkit)

Establishment of a formal, well-structured, and comprehensive Knowledge Management Framework (KMF) to organize, preserve, and share information and resources relevant to USGS scientific data integration and management, whether resulting from or referenced in the course of CDI projects, should be an essential component of the CDI effort. Such a framework would provide a ready base for documenting systems, tools, standards, processes, and practices, which could be easily drawn upon in support of CDI-related community learning, training, technical support, and general communication efforts.

We propose that a workgroup be formed to address the need for creating and managing a sound KM Framework for the benefit of the CDI and its Science partners.

A Knowledge Management Framework page has been created to flesh out the initial ideas summarized here:

Information resources falling under this framework might include:

Tools to support KM would include:

Workgroup membership:  Dave Govoni (convenor), Richard Huffine (co-convenor)

Funding Considerations and Logistics

Research Resources


DBlodgett's high-level observations from 9-2-2010 call:


Couple of comments/questions:


--rviger@usgs.gov, 14-Sep-2010