At the October 10, 2018 CDI monthly meeting, we heard about ongoing projects that could help us with our spatial data workflow, share solutions for the challenges of integrating incomplete and disparate data, and allow us to test and use technologies for storing and managing large volumes of data.
First, Kevin Gallagher gave us a preview of the FY19 CDI Request for Proposals themes: biosurveillance of emerging invasive species and health threats, building national datasets, reusing previously funded CDI outputs, and enabling FAIR (Findable, Accessible, Interoperable, Reusable) data. The official Request for Proposals was released the following week, and you can see the details here: https://my.usgs.gov/confluence/display/cdi/2019+Proposals
The deadline for 2-page statements of interest is November 16, 2018!
Next, I had a brief Q&A with Sky Bristol about building a spatiotemporal feature registry. The concept is to design and build a system that supports usable, repeatable processes that rely on spatial features. Sky is looking for feedback on how such a system can be built broadly enough to benefit many people. I hope to have more Q&As with CDI members about their projects in the future!
Ben Mirus from the Geologic Hazards Science Center presented on Assembling a National Scale Map of Landslide Inventories from Incomplete and Disparate Spatial Data. His presentation raised some topics to explore further with CDI: which other disciplines work with similarly incomplete and disparate data (for example, species occurrence), and what theory exists for quantitatively analyzing such data (for example, a dataset that mixes point locations and polygons of landslide scars).
Previous landslide compilation.
Matt Davis, from the Advanced Research Computing group, presented on A Cost Effective Approach to Scientific Data Storage and Management: BlackPearl and Globus. This presentation was exciting because we often get questions about how we in the USGS are supposed to meet data release requirements for large volumes of data, or even share that data within a group of researchers. Here, "large" means files much bigger than 10 GB. Matt let us know that YES, there are new options for storing and managing large data that are available to USGS researchers now (in beta). To get started, contact email@example.com and tell the Advanced Research Computing team about your data needs.
An image from Matt Davis' presentation.