You'll probably want to join a new collaboration area after reading all of this exciting news. You can do that by following the instructions on this wiki page. You can also get to all CDI Collaboration Area wiki pages here. We recently added quick-link buttons for meeting content to most collaboration area wiki pages.
The group had a conversation about the need for persistent unique identifiers in USGS metadata records that could be used across different government systems, including usgs.gov and data.gov. Lisa Zolly presented some slides to frame the conversation, and take-aways are on the Metadata Reviewers Meetings Page.
February's DevOps meeting was like a Valentine's Day-themed love letter to the partnership between Dev and Ops. (Yes, that is a subjective opinion.) Andy Stauffer (Dev) and Robert Djurasaj (DevOps) combined to present on "Automating the Deployment of a National Geospatial Map Production Platform Using DevOps Workflows." A dedicated DevOps team was critical for scaling up workflow and infrastructure for the Map Production On Demand (POD) system. See the recording at the DevOps Meeting page.
Slide from the DevOps February meeting presentation.
The Data Management working group meeting used an interactive virtual format to have small group discussion about developing data management value propositions. Science Gateways Community Institute superstars Claire Stirm and Juliana Casavan presented the essence of value propositions - A clear understanding of the unique value your project delivers to your users or stakeholders. Virtual breakout groups held discussions and came up with many answers to "Why is CDI important for data managers?" Claire and Juliana's tips on value propositions included being succinct and developing different value propositions for different audiences. See the slides and the value propositions at the meeting notes page!
A general formula for developing a value proposition statement.
Mike Johnson from UC Santa Barbara presented on the Urban Flooding Open Knowledge Network. This is an exciting stakeholder-driven knowledge network project with emphasis on prototyping interfaces and web resources. See other joint CDI Tech Stack and ESIP IT&I webinars on the ESIP page.
Sophie Hou presented on "Choosing Usability Techniques." The process starts with establishing the context: what are you trying to learn and why? What are the gaps? What information do we need to learn and why? Next you should decide on the types of data needed: Attitudinal vs. Behavioral vs. Qualitative vs. Quantitative. Thank you to Sophie for continuing to build our knowledge about how to improve usability in our projects! See the slides and the notes on the Usability meeting page.
Slide from the Usability group's February presentation showing the differences between different types of usability data.
The Risk CoP hosted guest Scott Miles from Impact360 to kick off a special training series. Scott's presentation was loaded with information!
From the Risk Meetings page: This was the kickoff meeting for a series of training webinars provided by Impact360 Alliance on human-centered design thinking and inclusive problem solving. We met the Impact360 team and were introduced to the foundations and terminology of Impact360's toolkit, Toolkit360. Toolkit360 is a rigorous, intentional process to collectively amplify researcher and practitioner superpowers to integrate knowledge and action, unlike doing it “the way we’ve always done it” or working in silos. Toolkit360 fuses processes and methods from human-centered design, frame innovation, and systems theory (Panarchy). The Toolkit360 process uses 12 tools to bridge the problem and solution spaces with situation assessment, stakeholder alignment, problem framing, and prototyping. Future training webinars during the Risk CoP's March and April monthly meetings will take deeper dives into these tools.
Slide from the Risk Community of Practice's February meeting.
Sheree Watson presented on Using Open Innovation to Engage Youth and Underserved Communities in the Pacific Islands. Sophia B. Liu followed with a discussion of why Open Innovation matters and how to participate in the USGS Open Innovation Strategy. See more ways to get involved in USGS Open Innovation at the Open Innovation wiki site!
Peter Claggett and Labeeb Ahmed presented on Mapping Channels & Floodplains: Hyper-Res Hydrography and FACET. FACET stands for Floodplain and Channel Evaluation Tool and more can be found at https://code.usgs.gov/water/facet.
Matt Baker (University of Maryland, Baltimore County) presented on 'Hyper'-resolution Geomorphic Hydrography: Methods, advantages, and shifting paradigm.
I always try to make myself look like I've been around for a long time with references to "do you remember when?" and I will do that now. I remember when the first LIDAR images started showing up at conferences and presenters would first show a standard (at the time) 30-meter DEM and then flip to the next slide that was LIDAR 1-meter resolution, and the whole audience would gasp in astonishment. Then this was followed by people saying "Well, none of the old tools work on LIDAR data, we have to build a whole bunch of new tools to analyze this higher resolution data." And that is what I thought of when watching these presentations. We've come a long way!
Access the recording at the Geomorphology Tools meeting page.
Slide from the Hyper-Resolution Geomorphic Hydrography presentation.
Jeremy Fee presented on Swagger and Micronaut. Jim Kreft showed the example of the Water Quality Data Portal at https://www.waterqualitydata.us/.
Learn more at:
See the recording at the Software Dev meeting page!
Jim Kreft demonstrated some of the details of the National Water Quality Data Portal.
Thanks to our collaboration area group leads, who organized topics and speakers! Lightsom, Frances L.; Masaki, Derek; Hughes, David R.; Langseth, Madison Lee; Hutchison, Vivian B.; Blodgett, David L.; Signell, Richard P.; Hou, Chung Yi (Contractor); Ludwig, Kristin A.; Emily Brooks; Ramsey, David W.; Liu, Sophia; Ladino, Cassandra C.; Guy, Michelle; Newson, Jeremy K.
You can add yourself to a CDI collaboration area by following the instructions on this wiki page.
Want to know more about when to do usability testing, how to include usability effectively, and how to foster a user-centered approach in a project? The January Usability collaboration area resource reviews address these topics. Sophie Hou will be posting resource reviews every other month to help us build up our usability knowledge. Check out the usability resource reviews on our wiki here.
Sharing ownership of UX, from https://www.uxmatters.com/mt/archives/2007/05/sharing-ownership-of-ux.php
Lisa Zolly presented on "What's new with the DOI Tool?"
The USGS DOI Tool allows USGS personnel to assign globally unique, persistent, and resolvable identifiers, registered with DataCite, to USGS-funded data products. Recent improvements to the tool address usability issues and compatibility with external systems such as Crossref DOIs for related publications, DataCite, and ORCID researcher IDs. See also the DOI Tool FAQs.
Find the slides, which clearly lay out the changes and the reasons for the changes, on the Metadata Reviewers Community of Practice meeting page.
The USGS DOI Tool allows for different levels of versioning of data releases.
Mike Giddens from Xentity presented "A Serverless Platform for Protected Areas of the U.S." He described a pilot project to explore a possible next generation of the PAD-US (Protected Areas of the U.S.) application in a "completely serverless architecture." The revised framework aims to allow for more efficient intake and validation of data, improved data quality, and capacity to publish new data products for biodiversity assessments and recreational applications. The presentation recording is available on the DevOps meetings page.
The current interface of the Protected Areas Database of the United States.
The group looked at the paper "SemantEco: a semantically powered modular architecture for integrating distributed environmental and ecological data." In addition, resources on the basics of the semantic web were shared on the topics of graph databases and ontologies and reasoning. See more at the Semantic Web meetings page.
Figure detailing a subset of the wildlife ontology from the SemantEco paper.
Heather Schreppel and VeeAnn Cross presented on COMPASS, the data inventory system for the USGS Coastal and Marine Hazards and Resources Program (CMHRP). COMPASS provides a centralized location to collect and preserve field activity information, track data, search for data within the system, facilitate field activity management, aid data managers in the preservation of scientific data, and maintain a comprehensive data catalog.
Colin Talbert presented on DataDash: Fort Collins Science Center’s Proposal, Project, Deliverable, and Archive Tracking Dashboard. DataDash links the Center's project records with several other USGS systems to improve project tracking.
The presentation slides and recording are available on the meeting wiki page.
Slide about the COMPASS data inventory system and its purpose.
Risk CoP summary provided by Kris Ludwig:
Applying Usability to Web-Tool Design and Development – Using HERA as a Use Case. Sophie Hou provided an overview of HERA (the USGS Hazard Exposure and Analytics Tool, https://www.usgs.gov/apps/hera/), the motivations for HERA to enhance its user interface and user experience, the usability strategy and techniques the HERA team has been using for the redesign, and recent results.
Update on the Strategic Hazard Identification and Risk Assessment (SHIRA) Project: Engaging Stakeholders in the Pacific Northwest. Emily Brooks and Alice Pennaz provided an update on the SHIRA project and shared themes from stakeholder interviews with emergency managers in National Parks and National Wildlife Refuges in Washington state.
The recording and slides are available on the Risk meetings page.
In the coming months during their regular meeting time, the Risk CoP will host a series of trainings from the Impact360 Alliance: "Elevating the interconnected nature of research and practice to reduce impacts of natural hazards and disasters." Learn more on the Risk wiki page.
Slide from the SHIRA - engaging stakeholders in the Pacific Northwest presentation.
Cassandra Ladino presented on Software Policies You Should Know, followed by a discussion of O365 Tips and Tricks. If you have ever wondered about FITARA, FedRAMP, GitLab and Federal Source Code Policy, Systems and Records Notices, the USGS Software Management Website, or your new O365 tools, this is the meeting for you. The recording is available on the SoftwareDev meetings page.
And that's it for January!
The Geomorphology Tools group has been sharing methods that have been developed at different science centers in the USGS.
In December, GIS Tools to locate potential sediment sources in stream-channel networks was presented by Jen Cartwright of the Lower Mississippi-Gulf Water Science Center. More information about the tools is in the USGS publication: Automated identification of stream-channel geomorphic features from high‑resolution digital elevation models in West Tennessee watersheds.
Luke Sturtevant of the New England Water Science Center presented on Automated Hydro-enforcement, describing some tools that were developed to help with flood insurance studies. The tools use Python and the Spatial Analyst toolbox in ArcGIS Pro to derive high-resolution hydrologic models on a HUC8 scale from the most current lidar terrain at full resolution.
Image from Luke Sturtevant's presentation.
The group reviewed questions from their forum, which collects and documents Q&A within the metadata reviewers community. The forum currently has 29 topics, including specifics about data dictionaries, metadata editors, metadata training, and more!
The Data Management working group had two presentations:
How data flow from IPDS to CMS (the USGS Drupal web content management system), presented by Lance Everette, Web Reengineering Team (WRET). Lance described the data pipeline for data releases and also some details about special data such as provisional data.
Connecting USGS systems, presented by Drew Ignizio, Science Analytics and Synthesis. Drew described ongoing work that will inform how to structure information well when we collect it, what the connections to different tools are, and how data creation and formats can be homogenized. This work will allow a more efficient understanding of relationships between publications, data, authors, and USGS organizations.
An image from Drew Ignizio's presentation.
The Fire Science community of practice heard an update on the draft USGS Fire Science Strategic Plan and Stakeholder Input, led by Paul Steblein and Mark Miller.
There was also a science talk: Aquatic Ecosystem Vulnerability to Fire and Climate Change in Alaskan Boreal Forests, given by Jeffrey Falke, Alaska Fish and Wildlife Coop Unit Leader.
Slide from Paul Steblein's presentation
The first meeting of the CDI Usability collaboration area was led by Sophie Hou. The collaboration area will alternate between two formats: live Town Hall meetings and posting of Resource Reviews. Sophie provided an overview of usability concepts including Definitions: Utility, Usability, Usefulness, User Interface (UI), User Experience (UX), Assessing Needs & Establishing Scope, Structuring Information, Creating Content and Features, and Testing & Validating User Feedback. Her presentation materials have many links to references and further information.
The group established a page to describe their usability interests and their willingness to participate in usability studies.
The key take-away: … usability is a necessary condition for survival..., it will help not only to gain users, but to develop partnerships.
Image from Sophie's presentation, retrieved from https://nimitmangal.wordpress.com/2013/09/19/what-is-usability/
What We Learned at the 2019 Society for Risk Analysis meeting - Kris Ludwig shared highlights from the recent Society for Risk Analysis conference.
Ecological risk assessment: demystifying our science to inform regulatory decisions - Jeff Steevens, Research Toxicologist, USGS Columbia Environmental Research Center (CERC), provided a brief overview of the science being developed by USGS CERC for aquatic ecological risk assessment. He shared a few real-world examples to show how a conceptual evaluation and a weight-of-evidence (WOE) approach can be used to integrate new techniques and data into the regulatory decision-making process. Jeff posed a couple of questions to the group, which we should revisit at a future meeting! 1) What methods/tools can we use to consolidate complex data so that it is not confusing to decision-makers? 2) How do we transition new tools/data into the decision-making process?
This month's summary is provided by Kris Ludwig.
Image from Kris Ludwig's presentation.
The Risk Community of Practice is led by Kris Ludwig, Dave Ramsey, and Emily Brooks, and the meeting materials and recording can be found on their meeting wiki page (CDI member log-in required).
November's topic was "identifying different types of metadata." From the Metadata Reviewers Meeting page: …"metadata" is a word that refers to a variety of things. It would be good for USGS to develop accepted terms for different kinds of metadata. If the Metadata Reviewers Community agreed on those terms and their meanings, we could lead USGS just by the way we talk and write. Let's see what we can agree on!
Cassandra Ladino gave an overview of USGS Software Policy, including topics such as policy that affects purchasing and management of computer technology, authorized cloud solutions, and authorization of software installations.
See more at the DevOps wiki page.
Natalya Rapstine presented on AI/ML in the USGS enabled by Tallgrass: classifying golden eagle behavior using telemetry. She used a recurrent autoencoder neural network to encode sequential California golden eagle telemetry data, followed by an unsupervised clustering technique, Deep Embedded Clustering (DEC).
Two slides from Natalya Rapstine's November AI/ML presentation.
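For readers unfamiliar with Deep Embedded Clustering, the clustering step can be sketched in a few lines. This is a minimal illustration of the soft-assignment and target-distribution formulas from the published DEC formulation, not the code used in the presentation. The two-dimensional `embeddings` and the `centroids` below are made-up stand-ins for the autoencoder's encoded telemetry sequences, and the function names are hypothetical.

```python
import math

def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t soft assignment q[i][j] of each embedded point to each centroid."""
    q = []
    for z in embeddings:
        weights = []
        for mu in centroids:
            dist_sq = sum((a - b) ** 2 for a, b in zip(z, mu))
            weights.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        q.append([w / total for w in weights])
    return q

def target_distribution(q):
    """Sharpened targets p[i][j], proportional to q[i][j]**2 / f[j] with f[j] = sum_i q[i][j]."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters; DEC minimizes this quantity."""
    return sum(
        p_ij * math.log(p_ij / q_ij)
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
        if p_ij > 0
    )

# Made-up 2-D embeddings standing in for encoded telemetry sequences.
embeddings = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
print("clustering loss:", kl_divergence(p, q))
```

In full DEC training, the gradient of this loss updates both the centroids and the encoder weights. The sketch only evaluates the loss once for fixed inputs.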
See more at the AI/ML meetings page. The AI/ML group is going to take a two-month break, and the next meeting will be in February.
The Tech Stack group is hosting a series of linked data and spatial data integration talks. The discussion will be continued at the Earth Science Information Partners Winter Meeting coming up in January 2020. In November, Shane Crossman and Matt Purss presented on the Australian Location Index (LOC-I) http://locationindex.org/.
Frame from a video describing the Location Index project, which can be found at http://locationindex.org/.
CDI Tech Stack meetings are held jointly with the ESIP IT&I Tech Dive. See recording at the ESIP Tech Dive page.
The Data Management working group's "Speed data-ing" session at the CDI workshop revealed a challenge about bulk updating metadata, and also corresponding expertise from other members of the group. Greg Gunther presented the Data Management Challenge - Updating Thousands of Metadata Records. Then VeeAnn Cross & Peter Schweitzer presented on Bulk Updating Metadata Records in Practice in the Coastal and Marine Hazards group.
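As a rough illustration of what a bulk metadata update can look like, here is a minimal sketch using only the Python standard library. It assumes the records are FGDC CSDGM XML and that the change is a new contact email. The record IDs, email addresses, and function names are hypothetical, and this is not the workflow shown in the presentations.

```python
import xml.etree.ElementTree as ET

def update_element_text(xml_text, path, new_value):
    """Set the text of every element matching path; return (updated_xml, n_changed)."""
    root = ET.fromstring(xml_text)
    changed = 0
    for element in root.findall(path):
        if element.text != new_value:
            element.text = new_value
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed

def bulk_update(records, path, new_value):
    """Apply the same edit to {record_id: xml_text}; report which records changed."""
    updated, touched = {}, []
    for record_id, xml_text in records.items():
        new_xml, n_changed = update_element_text(xml_text, path, new_value)
        updated[record_id] = new_xml
        if n_changed:
            touched.append(record_id)
    return updated, touched

# Hypothetical records; a real workflow would read and write files or a catalog API.
TEMPLATE = (
    "<metadata><idinfo><ptcontac><cntinfo>"
    "<cntemail>{email}</cntemail>"
    "</cntinfo></ptcontac></idinfo></metadata>"
)
RECORDS = {
    "rec-001": TEMPLATE.format(email="old@example.gov"),
    "rec-002": TEMPLATE.format(email="new@example.gov"),
}

# Path is relative to the <metadata> root element.
updated, touched = bulk_update(
    RECORDS, "idinfo/ptcontac/cntinfo/cntemail", "new@example.gov"
)
print("records changed:", touched)
```

Returning the list of touched records makes it easy to review or log exactly which records were modified before writing anything back.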
The recording is available at the Data Management WG meeting page.
Understanding the Value of Usability Evaluation. Mike Frame, Rachel Volentine, and Sophie Hou reviewed the general concepts of usability and user analysis, benefits seen by USGS, and current collaborations with the University of Tennessee User eXperience Laboratory. They also invited people to join the new Community for Data Integration Usability collaboration area.
After the Flames: The Science of Post Wildfire Response. Steve Sobieszczyk highlighted some of the requirements, tools, and activities tied into post-wildfire response and what it is like to deploy to an active fire incident as a burned area emergency response (BAER) team member.
Title slide from Steve Sobieszczyk's presentation.
See a more detailed meeting recap at the Risk CoP Forum.
November's monthly meeting had a visualization theme, building off a discussion that was first started at the CDI workshop this past summer. The presenters showed some examples of visualization tools in use at the USGS.
A poll showed some tools in use by the meeting attendees:
Lindsay Platt reminded us that there are many different formats and tools for data visualization, and we should think about the user needs before committing to a specific tool.
Laura DeCicco showed us some tools in R to develop visualizations directly from the data, such as ggplot2, thus cutting down on mistakes in creating plots. She also shared many blog posts about other visualization techniques.
Rich Signell demoed HoloViz, "a coordinated effort to make browser-based data visualization in Python easier to use, easier to learn, and more powerful." The demo showed how compact code can create something almost like a dashboard with powerful interactive capabilities. The demo can be found at https://github.com/reproducible-notebooks/HoloViz-CDI-Demo.
Ben Letcher demoed dynamic visualization interfaces at ice.ecosheds.org. Multivariate filtering is one of the neat capabilities of the tool, which is built on Vue/Vuex, D3, Leaflet, and Crossfilter. He would like to have more discussions about the role of data visualizations in decision making.
Dionne Zoanni gave an overview of the recent USGS Tableau workshop. She also shared several links to further resources for Tableau users in the USGS.
Finally, we unveiled the Community Voting Results from Phase 1 of the CDI Request for Proposals, showing widespread support for all of our 24 statements of interest.
See many more links and resources at the monthly meeting page!
In October, the DevOps group heard updates about the Federal DevOps Community of Practice (Brian Fox, General Services Administration), the DevOps Federal Interagency Council (Annette Mitchell, IRS), and DevOps in the Water Mission Area (Ivan Suftin, USGS).
See more at the DevOps wiki page.
A group discussing geomorphology tools and methods is now organized under the CDI wiki space. On October 1, Sinan Abood (USDA) presented on the USDA Forest Service National Riparian areas project (check out a storymap here). Greg Petrochenkov (USGS) and Fernando Aristizabal (NOAA) presented on Developing quick estimates of flood inundation for CONUS using an optimized GIS Flood Tool.
Image from National Riparian areas project storymap.
In a second meeting on October 31, Marina Metes presented on "Remotely detecting channel incision in headwater streams using lidar and topographic openness." She described a method to map reach-scale incision from lidar-derived digital elevation models using topographic openness, a landscape metric measuring the enclosure of an area (i.e. channel bottoms) relative to the surrounding landscape (i.e. stream banks). The method was validated with field surveys and local photogrammetric models of stream banks.
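To make the openness metric more concrete, here is a minimal sketch of positive topographic openness on a tiny grid. It assumes the commonly used definition (the mean, over eight compass directions, of the zenith angle to the highest point within a search radius) and is not the implementation from the presentation. The function name, the toy DEM, and the parameters are illustrative only.

```python
import math

# Eight compass directions as (row step, column step).
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def positive_openness(dem, row, col, cell_size, radius_cells):
    """Mean zenith angle in degrees at one cell, averaged over up to eight directions.

    Assumes the grid has more than one cell, so at least one direction is valid.
    """
    n_rows, n_cols = len(dem), len(dem[0])
    z0 = dem[row][col]
    zenith_angles = []
    for d_row, d_col in DIRECTIONS:
        step_length = cell_size * math.hypot(d_row, d_col)
        max_elevation_angle = None
        for k in range(1, radius_cells + 1):
            r, c = row + k * d_row, col + k * d_col
            if not (0 <= r < n_rows and 0 <= c < n_cols):
                break  # ran off the edge of the grid
            angle = math.degrees(math.atan2(dem[r][c] - z0, k * step_length))
            if max_elevation_angle is None or angle > max_elevation_angle:
                max_elevation_angle = angle
        if max_elevation_angle is not None:
            zenith_angles.append(90.0 - max_elevation_angle)
    return sum(zenith_angles) / len(zenith_angles)

# Toy DEM (meters): a flat surface at 10 m with a 2 m deep channel in the middle column.
dem = [[10.0, 10.0, 8.0, 10.0, 10.0] for _ in range(5)]

channel = positive_openness(dem, 2, 2, cell_size=1.0, radius_cells=2)
bank = positive_openness(dem, 2, 0, cell_size=1.0, radius_cells=2)
print(f"channel bottom: {channel:.1f} degrees, flat bank: {bank:.1f} degrees")
```

The enclosed channel bottom scores well below 90 degrees while the flat bank scores 90, which is the contrast an incision-mapping method can exploit.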
See notes and more information at the Geomorphology Tools Meetings Page.
The Metadata Reviewers group talked about the new Fundamental Science Practices guidance, a revision of Guidance on Documenting Revisions to USGS Scientific Digital Data Releases. There was a request for examples of revised data releases in ScienceBase and here are links to a few examples:
See more at the Metadata Reviewers Meeting page.
Hamed Alemohammad, Chief Data Scientist at Radiant Earth Foundation, presented on Radiant MLHub: A Repository for Machine Learning Ready Geospatial Training Data. Radiant Earth Foundation has established Radiant MLHub to foster sharing of geospatial training data for different thematic applications. Radiant MLHub is hosted on the cloud and users will be able to search for different training datasets, and quickly ingest them into their pipelines using an API. This presentation reviewed the architecture of Radiant MLHub, its API access, and example applications for landcover classification from multi-spectral data and surface water detection from Synthetic Aperture Radar (SAR) data.
Image from the Radiant Earth Foundation webpage.
See more at the AI/ML meetings page.
Dave Blodgett (USGS) provided an update on activities of the OGC Environmental Linked Features Interoperability Experiment (ELFIE) and a preview of activities of the Second ELFIE (SELFIE). The meeting was an interactive demonstration-based session focused on practical application of JSON-LD to link features and observations.
Image from the SELFIE webpage.
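For a sense of what linking a feature to an observation with JSON-LD can look like, here is a minimal sketch built with the Python standard library. It assumes the W3C SOSA vocabulary and schema.org for the terms, which may differ from the contexts used in ELFIE and SELFIE. All `example.org` identifiers and the function name are hypothetical.

```python
import json

def build_linked_observation(feature_id, observation_id, property_id, value, time):
    """Return a JSON-LD document that links one observation to its feature of interest."""
    return {
        "@context": {
            "sosa": "http://www.w3.org/ns/sosa/",
            "schema": "https://schema.org/",
            "xsd": "http://www.w3.org/2001/XMLSchema#",
        },
        "@id": observation_id,
        "@type": "sosa:Observation",
        # The link: the observation points at the feature by its identifier.
        "sosa:hasFeatureOfInterest": {"@id": feature_id, "@type": "schema:Place"},
        "sosa:observedProperty": {"@id": property_id},
        "sosa:hasSimpleResult": value,
        "sosa:resultTime": {"@value": time, "@type": "xsd:dateTime"},
    }

# Hypothetical identifiers, for illustration only.
doc = build_linked_observation(
    feature_id="https://example.org/feature/river-reach-1",
    observation_id="https://example.org/observation/42",
    property_id="https://example.org/property/streamflow",
    value=12.5,
    time="2019-10-01T00:00:00Z",
)
print(json.dumps(doc, indent=2))
```

Because the feature is referenced by `@id` rather than embedded, any other document that uses the same identifier describes the same feature, which is what allows data from separate systems to be joined.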
CDI Tech Stack meetings are held jointly with the ESIP IT&I Tech Dive. See recording at the ESIP Tech Dive page.
From the Risk Community of Practice meetings page:
We reviewed a few key announcements, including a preview to the FY20 Risk RFP Opportunity.
Following announcements, we heard from Carolyn Driedger, Hydrologist/PIO/Outreach Coordinator, Cascades Volcano Observatory, and Tina Neal, HVO Scientist-in-Charge, who discussed the process and results from a recent usability study of the USGS Volcano HAzards Notification System (HANS). The Communications Work Group within the USGS Volcano Science Center began to write a 'Style Guide' for its HAzards Notification System (HANS) information products, which convey volcano status to the public and officials. As they wrote the style guide, they recognized the need for more formalized user input on which to base their guidance, both with formatting and wording. Carolyn shared some of the documents from this study and the group had a vibrant conversation about science communication following some questions she posed to everyone at the end of her talk.
See a more detailed meeting recap at the Risk CoP Forum.
Recommended links about October's Software Dev topics include the USGS Software Management Website and the USGS Scientific Software IM. The group also heard short talks from two CDI statements of interest that were up for evaluation: Aaron Fox, USGS Cloud Environment Cookbook, and Steve Fick (for Michael Duniway), So you want to build a web-tool?: Assessing successes, pitfalls, and lessons learned in an emerging frontier of scientific visualization.
Front page of the USGS Software Management website.
See more at the Software Development wiki page.
October’s CDI meeting explored current practices on some technical aspects of delivering scientific data and results.
David Hughes and Rob Djurasaj, from the CDI DevOps Community of Practice, presented on DevOps' role in data integration and delivery. In addition to an introduction to DevOps (the bringing together of software Development and Operations), they explained how certain new technologies are allowing the USGS NGTOC (National Geospatial Technical Operations Center) to improve their efficiency and save on costs. For example, efficiencies are gained by employing IaC, Infrastructure as Code, the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools.
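The core idea of Infrastructure as Code can be sketched without any cloud tooling: the desired infrastructure is written down as data, and a program works out what has to change. The example below is a simplified illustration of that idea only, not the tooling used at NGTOC. The resource names, fields, and the `plan_changes` function are hypothetical.

```python
def plan_changes(desired, current):
    """Compare desired and current resources; return a sorted list of (action, name)."""
    steps = []
    for name, spec in desired.items():
        if name not in current:
            steps.append(("create", name))
        elif current[name] != spec:
            steps.append(("update", name))
    for name in current:
        if name not in desired:
            steps.append(("delete", name))
    return sorted(steps)

# Desired state: in practice this would live in a version-controlled definition file.
desired = {
    "web-server": {"size": "medium", "count": 2},
    "tile-cache": {"size": "small", "count": 1},
}

# Current state: what is actually running (hypothetical).
current = {
    "web-server": {"size": "small", "count": 2},
    "old-batch-node": {"size": "large", "count": 1},
}

plan = plan_changes(desired, current)
for action, name in plan:
    print(f"{action}: {name}")
```

Because the definition is plain data, it can be reviewed, versioned, and re-applied, and re-running the plan against an environment that already matches produces no steps.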
Keith Maull and Matt Mayernik from the National Center for Atmospheric Research (NCAR) Library joined us to present on Packaging data and software. NCAR has been publishing reports since the 1960s, and increasingly both text and a code repository are part of the publication. How can they make the science in the publications reproducible for decades into the future? Some of their code can only run on supercomputers, and even tools like Jupyter notebooks sometimes do not fully run in new environments. They initiated discussion on optimal approaches for identifying, validating, characterizing, and preserving scientific information and tools.
Reproducibility considerations related to user needs and computational environments.
See the recording and slides at the meeting page.
The presentation topic was the USGS Bird Banding Lab ReportBand application.
The Bird Banding Laboratory (BBL) is an integrated scientific program established in 1920 supporting the collection, archiving, management and dissemination of information from banded and marked birds in North America. Their ReportBand application facilitates reporting of bird bands. It is a serverless application that utilizes an S3 Bucket configured to deliver a single page application running off Amazon Web Services Lambda functions using CloudFront and Route 53. The presentation covered the infrastructure-as-code CloudFormation templates stored in GitLab that configure and provision the application and backend in the Cloud Hosting Solutions environment. You can see the app at www.reportband.gov.
Image of Bird Bands from https://www.usgs.gov/centers/pwrc/science/about-federal-bird-bands
See more at the DevOps wiki page.
Karl Benedict (University of New Mexico) presented on the Data Management Training Clearinghouse (DMTC, https://dmtclearinghouse.esipfed.org/). The DMTC was initiated with CDI funds in FY2016, but has been supported by other funds since then, including DataONE, ESIP, and IMLS (Institute of Museum and Library Services). Current goals are to diversify the contents in discipline and target audience.
Madison Langseth presented on USGS-specific training materials and resources. These included trainings and resources on the USGS Data Management website. She also reviewed challenges from the CDI Workshop Speed Data-ing session, which included publishing large data sets that are user friendly, downloadable, and interactive; publishing data dictionaries; automating parts of data review tasks; data management planning strategies relevant to scientists and data managers; implementing records management with data management; and convincing scientists of the value of data management.
See the recording and slides on the meeting page.
Daryl Van Dyke, a spatial analyst for the Fish and Wildlife Service, presented on Utilizing Deep Neural Networks for Landscape Conservation: An Application of Google’s Tensorflow for a Cannabis Production Inventory in Northern California.
From the abstract: This presentation shows how an off-the-shelf Deep Neural Network (DNN) algorithm – Inception v2 – was retrained into a production classifier and applied to the problem of locating and sizing cannabis production on private lands in Trinity County.
Daryl included a vision where anyone doing a GIS project in the Department of the Interior would be required to submit bounding rectangles on a 10-meter grid for every single feature they worked with. This illustrates how defining standardized products at the Enterprise level would be powerful for things like creating a training dataset for landscape level classification.
A slide about the training dataset used to locate and size cannabis production on private lands in California.
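One way to picture the bounding-rectangle idea is to snap each feature's extent outward to a shared 10-meter grid, so rectangles from different projects line up. The sketch below is one possible interpretation, not a specification from the talk. It assumes projected coordinates in meters, and the function name and sample points are hypothetical.

```python
import math

def snapped_bounding_box(points, grid=10.0):
    """Return (min_x, min_y, max_x, max_y) expanded outward to the nearest grid lines."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x = math.floor(min(xs) / grid) * grid
    min_y = math.floor(min(ys) / grid) * grid
    max_x = math.ceil(max(xs) / grid) * grid
    max_y = math.ceil(max(ys) / grid) * grid
    return (min_x, min_y, max_x, max_y)

# Hypothetical feature vertices in projected meters.
feature = [(503.2, 1207.9), (518.6, 1214.1), (511.0, 1225.5)]

bbox = snapped_bounding_box(feature)
print(bbox)
```

Snapping outward guarantees the rectangle always contains the whole feature, at the cost of a slightly larger extent.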
See more information and the recording at their meeting page.
Michael Rilee, of Rilee Systems Technologies LLC presented about STARE (SpatioTemporal Adaptive Resolution Encoding for Scalable Integrative Analysis).
From the abstract: Aligning and integrating different kinds of Earth Science data is a laborious process, leading most researchers to focus on more generic, high level data products that are more easily compared. Dealing with the great volume and variety of Earth Science data is the goal of the NASA/ACCESS-17 STARE project. STARE is a unifying indexing scheme addressing variety and is well suited for applying distributed storage and computing resources to address volume.
Dealing with the great volume and variety of Earth Science data is the goal of the NASA/ACCESS-17 STARE project.
See the recording at the ESIP Tech Dive page.
From the Risk CoP meetings page:
We had two thought-provoking presentations that included insights from the economics and cultural anthropology fields.
The WiRē Team: a long-term research-practice collaboration for supporting wildfire adaptedness in Wildland-Urban Interface (WUI) communities. James Meldrum, Research Economist, Fort Collins Science Center, presented on the Wildfire Research (WiRē) Team and an innovative approach to uniting research and practice, collecting and using community-specific survey and risk assessment data to support local solutions to wildfire risk while also advancing the academic literature on the topic. See their website for more info: https://wildfireresearchcenter.org/.
Emily Brooks, USGS American Association for the Advancement of Science (AAAS) Science & Technology Policy Fellow, presented on Community-Centered Climate Planning for People and Parks. She described a new toolkit she developed with the National Park Service for community-centered climate change planning. Her talk included some hard but important lessons about the future of our cultural resources, including parks and historical sites, in the face of slow-onset disruptions related to climate change, as well as some useful tips for working with different stakeholders and communities.
The WiRē approach - see more at https://wildfireresearchcenter.org/approach.
The topic for September’s Software Development Cluster meeting was Databases, and Beyond…. Topics covered included the AWS Database Freedom Project, database scenarios, and an Overview of AWS databases. Recommended Video Resource on the AWS YouTube channel: How to Choose the Right Database for the Job
Image from Amazon Aurora intro video. Amazon Aurora is being tested for some USGS databases that experience high traffic spikes (for example, during natural disasters).
At the September 11, 2019 CDI Monthly Meeting, we heard about some ongoing projects in the Water Resources Mission Area.
Katie Skalak presented on the USGS Water Prediction Work Program (2WP). The Water Prediction Work Program (2WP) will take advantage of the USGS observational network and the wide body of process-based research to guide prediction of Earth surface processes that govern water resources and water quality. 2WP is very much aligned with our broader CDI priority of integrated predictive science capacity. Hundreds of people are involved, and the project aims for open science by design and building the integrated science culture. These concepts will help the team of teams to achieve simultaneous action and awareness.
Title slide from the 2WP presentation.
Roland Viger told us more about the National Hydrologic Geospatial Fabric: a framework for the integration of water information. NHGF is a hydrologically informed information architecture for integrated science. Roland presented the concept as categorized by DNA, skeleton, and meat, which are related to hydroinformatics, information architecture, and high value data themes, respectively. The types of data and information involved include river corridor data, dynamic landscape characteristics, data models, data gathering, harmonization, and data integration.
River corridor information planned for the National Hydrologic Geospatial Fabric.
Being from a different USGS mission area and not having heard much about these before, I found the presentations incredibly informative. Both of these projects are going to be integrating huge amounts of data and information, and I look forward to keeping up with their progress! If you have suggestions for other USGS initiatives that you’d like to learn more about at CDI Monthly meetings, let us know at email@example.com.
See the recording and slides at the September Monthly Meeting page.
Madison Langseth led a discussion about the proposed page on the Data Management Website about reviewing metadata. Details from the discussion can be seen on the group’s meeting notes page, and the Metadata Review webpage is now available and packed with useful information!
Creating a metadata record doesn’t end when the content is generated. Metadata review is an essential part of the process of creating a metadata record, and can involve finding a reviewer, checking technical and content aspects of the metadata, and communicating with the author. - from the USGS Data Management website
Screenshot from the USGS Metadata Review webpage.
Jim McAndrew from the NPS presented on National Park Service Vector Tiling activities. When web maps were first introduced, they relied on tiled raster data. But now vector data is used in the basemap itself instead of raster data. This allows web developers to access vector information in the basemap, and allows for custom on-the-fly styling. Jim discussed how the National Park Service made the switch to vector tiles, how they dealt with legacy applications, how they are making use of the newly available vector data, use in mobile apps, and future plans for combining information from other agencies in real time and displaying it in our web maps.
Check out the Web Mapping Tools for the NPS site.
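For readers new to tiled web maps, here is a minimal sketch of how a tile is typically addressed. It assumes the widely used Web Mercator "XYZ" tiling scheme, which the presentation summary does not explicitly name, and the function name is made up for illustration. The same zoom/x/y addressing is commonly used whether the tile payload is a raster image or vector geometry.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the Web Mercator tile containing a point.

    Assumes the common XYZ scheme: tile (0, 0) is the top-left of the map
    and each zoom level doubles the number of tiles along each axis.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


if __name__ == "__main__":
    # A point on the equator at the prime meridian, zoom level 1.
    print(lonlat_to_tile(0.0, 0.0, 1))
```

A client requests the tile at that zoom/x/y address. With vector tiles, the styling is then applied in the browser or app rather than baked into an image.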
Matthew Purss from Geoscience Australia presented on The Challenge of Location and How Discrete Global Grid Systems can enable Spatial Data Integration.
Existing approaches and disconnected infrastructures coupled with the myriad of ways to describe and store location information limit our ability to discover, access and integrate spatial data across organisation and jurisdiction boundaries to produce reliable and actionable information. The Location Index project (LOC-I) aims to introduce a consistent way to access, analyse and use location data to support the effective integration of socio-economic, statistical and environmental data from multiple data providers to support the spatially enabled delivery of Government policies and initiatives.
Bridging the vector/raster divide to enable data integration.
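The core idea of integrating data through a shared grid can be sketched in a few lines. This is only an illustration: it uses a simple equal-angle latitude/longitude grid as a stand-in, whereas a real Discrete Global Grid System uses hierarchical, equal-area cells. The function names and sample records are hypothetical.

```python
import math
from collections import defaultdict


def cell_id(lon, lat, cell_deg=1.0):
    """Map a point to a (row, col) cell in a simple lat/lon grid.

    Stand-in for a DGGS cell index; not an equal-area grid.
    """
    col = int(math.floor((lon + 180.0) / cell_deg))
    row = int(math.floor((lat + 90.0) / cell_deg))
    return (row, col)


def bin_by_cell(records, cell_deg=1.0):
    """Group (lon, lat, value) records by the cell they fall in."""
    cells = defaultdict(list)
    for lon, lat, value in records:
        cells[cell_id(lon, lat, cell_deg)].append(value)
    return cells


def join_on_cells(records_a, records_b, cell_deg=1.0):
    """Pair up values from two datasets that share a grid cell."""
    cells_a = bin_by_cell(records_a, cell_deg)
    cells_b = bin_by_cell(records_b, cell_deg)
    shared = cells_a.keys() & cells_b.keys()
    return {c: (cells_a[c], cells_b[c]) for c in shared}


if __name__ == "__main__":
    # Hypothetical records from two providers: (lon, lat, value).
    statistical = [(149.1, -35.3, "population")]
    environmental = [(149.9, -35.9, "rainfall"), (10.0, 10.0, "soil")]
    print(join_on_cells(statistical, environmental))
```

Once both datasets are expressed against the same cell identifiers, combining them becomes a simple key lookup, regardless of whether the inputs started as vector points or raster pixels.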
CDI Tech Stack meetings are held jointly with the ESIP IT&I Tech Dive series. See recording here.
The Data Management working group had two presentations at their August meeting:
USGS National Hydrography Data Management. Karen Adkins, Jerry Ornelas, and Lisa Kok presented an overview of USGS National Hydrography and related database management systems operations, including lessons in data management best practices.
Publishing bi-transect-extractor - A first experience publishing processing code with its derivative data. Emily Sturdivant presented on the challenges and opportunities when publishing a suite of related resources for a project: a methods open file report, a data release, software, and a journal publication, and getting them all linked together. See the officially released code at https://code.usgs.gov/cmgp/bi-transect-extractor
Linking software and data release (and journal articles and open file reports!)
The meeting recording is available at the DMWG August meeting page.
Daniel Buscombe from Northern Arizona University presented on Continuous streamflow and nearshore wave monitoring from time-lapse cameras using deep neural networks. He described a proof-of-concept study into designing and implementing a single deep learning framework that can be used for both stream gaging and wave gauging from appropriate time-series of imagery.
From sensor to decision maker for hydrodynamic monitoring.
More details and the recording are available on the AI/ML meetings page.
The August Risk Community of Practice meeting featured a summary of the July workshop and two presentations.
Our speakers for today's meeting had a water theme, with Curt Storlazzi sharing some exciting work on the cost-benefit analysis of coral reefs for reducing coastal hazards and Athena Clark sharing insights and actionable solutions on improving the display of USGS water data. Watch the recording to learn more! - Kris Ludwig
Rigorously Quantifying the Coastal Hazard Risk Reduction Provided by US Coral Reefs - Curt Storlazzi, Research Geologist, Pacific Coastal and Marine Science Center
I have an idea! Alternative ways of displaying our data - Athena Clark, Science Advisor, USGS Southeast Region and USGS Storm Team Lead
Title slide from Curt Storlazzi's presentation.
The recording and notes are available to CDI members at the Risk Meetings page.
The Software Development Cluster met in August to brainstorm about the CDI FY20 Request For Proposals! The group also gave an excellent presentation about the role of software in USGS data integration at the CDI Monthly Meeting on August 14.
Find out more about the group and their activities at their wiki page.
Our August 14, 2019 Monthly Meeting featured overviews of NASA EarthData and our own CDI Software Development Cluster.
Cynthia Hall from NASA gave an overview of some of the data types in NASA EarthData that overlap with our USGS Mission Areas. In addition, she presented some of the resources she developed to help data users navigate the system. There are toolkits (entry points to access NASA Earth science data) and pathfinders (designed to guide you through the process of selecting data products and learning how to use them). Both were developed with user input.
An image from Cynthia Hall's EarthData presentation.
CDI Software Development Cluster leads Jeremy Newson, Cassandra Ladino, and Michelle Guy were joined by Laura DeCicco and Emily Sturdivant to give an excellent overview that covered What is software? Who creates it? Why is it integral to data integration and delivery? What are some USGS Resources to improve software? What are some examples of USGS software for data integration and delivery?
A few relevant links that they presented:
What is software? Code? Applications? Answer: All of the above.
Software is important in many steps of data collection, analysis, and delivery.
See the archived slides and recording on the meeting page when you are logged in.
Peter Burkholder, a senior innovation specialist from 18F, was the guest presenter. 18F builds effective, user-centric digital services focused on the interaction between government and the people and businesses it serves. Peter is a DevOps engineer who has worked to develop cloud.gov and implement devops practices at 18F. He is also a geophysicist who previously worked at IRIS PASSCAL. His presentation covered best practices and technical implementation of automated infrastructure, resilient cloud operations, and continuous delivery pipelines.
Peter’s favorite 18F tools include
Viv Hutchison and Madison Langseth led a discussion that included a brief overview of the CDI DMWG session at the in-person meeting in June, data manager position descriptions for USGS, and contributed slides from working group members about data management staffing at their USGS science centers. In response to the poll question “If you consider yourself to be a data manager for your center, what is your current position description title?,” there were 19 different responses!
See slides and recording at the DMWG July meeting wiki page.
Josh Bradley and Dennis Walworth presented on the Open Source Metadata Toolkit, which was supported by the CDI from 2014-2015 (see project page on ScienceBase) and is still going strong!
The CDI Tech Stack group meets jointly with the ESIP IT&I group - access the slides and recording at the ESIP Tech Dive page.
The July meeting of the Fire Science Community of Practice (one of CDI’s newest collaboration areas) featured a community update and two science talks. Mark Miller provided a short community update presentation. Josh Picotte gave a science talk describing the LANDFIRE remap effort that is currently underway. LF Remap is designed to produce vegetation and fuels data that inform wildland fire and ecological decision support systems. Sheila Murphy gave a second talk called "Arsenic and old mines - Wildfire remobilizes historical mining waste." Other relevant files from July are included on the meeting page, such as a Menlo Park lecture on USGS Fire Science that was given by Paul Steblein earlier in the month.
This month’s Fire Science CoP summary was provided by Mark Miller. See slides and other materials at the July 16 Fire Science Community of Practice Meeting page.
The Risk Community of Practice July meeting was "live" from the first Risk CoP meeting in Golden, CO. On July 18, 2019, after some brief announcements, the group heard short presentations from the PIs of the FY19 Risk CoP funded projects:
Title slide from Jaiswal, Nassar, et al. Risk project - Assessing the risk of global copper supply disruption from earthquakes.
Stakeholder engagement is an important piece of the USGS Risk Plan. But what does it mean to engage with stakeholders? What does co-production mean? What tools are used for engaging stakeholders and over what timelines during the course of a project? What types of challenges arise during stakeholder engagement? What are some of the surprising considerations to keep in mind while working with stakeholders?
This special session was live from the Risk CoP meeting in Golden, CO and featured a panel discussion on stakeholder engagement. Panelists answered the following questions: 1) What does stakeholder engagement mean to you? What does co-production mean to you? 2) When, during the course of a project, do you engage your stakeholders? 3) What are three tools you use for engaging stakeholders? 4) Can you give an example of a challenge you have faced in doing stakeholder engagement and how you overcame it? (Paperwork Reduction Act, protected information, confidentiality issues) 5) What are some surprising considerations to keep in mind when doing stakeholder engagement? (e.g., inclusivity, ethics, manner of approach).
This month’s Risk CoP summary was provided by Kris Ludwig.
Recordings are available on the Risk CoP meetings page (log in required).
Anyone following this blog may notice that I am making an effort to bring it up to the present day, but am still a little bit behind. I still have great optimism about catching up, and these posts may help you reminisce about the summer.
At the July 10, 2019 CDI Monthly Meeting, we heard a proposal for ways to increase reusability of USGS datasets, and presentations from two map-based visualization and analysis tools. In addition, Kevin Gallagher reported on demographics, presentation materials, and take-aways from the CDI Workshop “From Big Data to Smart Data” that was held in June 2019 in Boulder, CO.
Responses to the CDI post-workshop survey showing the varied job descriptions in our community.
Richie Erickson presented a Scientist’s Challenge in exploring the use of Jupyter Notebooks to increase reusability of USGS datasets. He is focusing on smaller, project-level datasets that require explanation of disciplinary expertise and statistical analyses. To learn more, you can get in contact with Richie Erickson at firstname.lastname@example.org. See his slides here.
Image of the CDI-funded Online Landslide Inventory.
Ben Mirus’s presentation on a new national landslide inventory highlighted important considerations when integrating incomplete and disparate data. State boundaries often showed mismatches in data quantity or quality. Other topics of CDI interest included defining confidence metrics for the landslides, deciding on dataset update frequency, putting data releases through internal review, best practices for viewing heterogeneous data, identifying areas that need better data collection, and links from our science to governmental policy. Read more at Landslide Risks Highlighted in New Online Tool. This project is an FY18 CDI Funded Project, with more information at its ScienceBase page.
Example of US Topo map with National park boundary and water data.
Elizabeth McCartney and Greg Matthews’ presentation on the National Digital Trails Network showed a system that takes existing trails and then uses an algorithm to identify and evaluate potential connections between trail systems using data like land type (owner), slope, and hydrography/river crossings. If you are interested in learning more you can contact the team at any of the following addresses: email@example.com, firstname.lastname@example.org, email@example.com.
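To give a feel for how a connection between two trail systems might be evaluated, here is a generic least-cost-path sketch over a small grid. This is not the team's algorithm, which the summary does not describe in detail. The cost weights, the function names, and the sample grid are all hypothetical, chosen only to show how slope, land ownership, and river crossings could feed a single per-cell cost.

```python
import heapq


def cell_cost(slope_pct, is_public_land, crosses_river):
    """Hypothetical cost of routing a trail through one grid cell."""
    cost = 1.0 + slope_pct / 10.0  # steeper terrain costs more
    if not is_public_land:
        cost += 5.0  # penalty for crossing private land
    if crosses_river:
        cost += 3.0  # penalty for needing a river crossing
    return cost


def least_cost_path(cost, start, goal):
    """Dijkstra's algorithm on a grid; None marks an impassable cell.

    Returns (path, total_cost), where total_cost sums the cost of every
    cell entered after the start cell. Returns (None, inf) if unreachable.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        dist, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if dist > best.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (new_dist, (nr, nc)))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[goal]


if __name__ == "__main__":
    # Hypothetical 3x3 cost grid: None is impassable, 5 is a costly cell.
    grid = [
        [1, 1, 1],
        [1, None, 5],
        [1, 1, 1],
    ]
    print(least_cost_path(grid, (0, 0), (2, 2)))
```

Comparing the total cost of candidate connections is one simple way to rank them. A production system would work on real elevation, ownership, and hydrography layers rather than a toy grid.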
The recording of the meeting is available at the monthly meeting page if you are signed in as a CDI member.
In addition to the meetings described below, several collaboration areas met during the face-to-face CDI Workshop in Boulder, CO, June 3-7!
Chris Gorgolewski presented on “Google Dataset Search: Facilitating data discovery in an open ecosystem."
Talk description: There are thousands of data repositories on the Web, providing access to millions of datasets. In this talk, I will discuss recently launched Google Dataset Search, which provides search capabilities over potentially all dataset repositories on the Web. I will talk about the open ecosystem for describing and citing datasets that we hope to encourage and the technical details on how we went about building Dataset Search. Finally, I will highlight research challenges in building a vibrant, heterogeneous, and open ecosystem where data becomes a first-class citizen.
Related links: https://toolbox.google.com/datasetsearch (Accessible when not signed in with a Dept of Interior Google account), https://www.blog.google/products/search/making-it-easier-discover-datasets/
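As a rough illustration of what "describing datasets" in an open ecosystem can look like in practice, here is a minimal sketch that builds schema.org Dataset markup as JSON-LD, the kind of structured description that Dataset Search reads from web pages. The talk description does not list specific fields, so the helper name, the fields chosen, and all values below are placeholders.

```python
import json


def dataset_jsonld(name, description, url, keywords, identifier=None):
    """Build a schema.org Dataset description as a JSON-LD string.

    Only a handful of common properties are included; values are
    supplied by the caller.
    """
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
    }
    if identifier:
        # For example, a DOI or other persistent identifier.
        doc["identifier"] = identifier
    return json.dumps(doc, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(dataset_jsonld(
        name="Example landslide inventory",
        description="A placeholder description of an example dataset.",
        url="https://example.org/datasets/landslides",
        keywords=["landslides", "hazards"],
    ))
```

The resulting JSON-LD would typically be embedded in the dataset's landing page inside a script tag of type application/ld+json.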
Slide from Chris Gorgolewski's talk on Google Dataset Search.
The recording can be found on the ESIP Tech Dive meetings page. Dave Blodgett and Rich Signell are the group leads.
Pete Doucette provided a review of recent AI/ML-related Strategic Science Planning at USGS. This included thoughts captured from the recent USGS 21st Century Science Workshop (May 2019) at the National Conservation Training Center, and the CDI Workshop in Boulder, CO (June 2019).
The recording can be found on the AI/ML Meetings page. Pete Doucette and JC Nelson are the group leads.
The Semantic Web Working Group's June discussion centered on persistent identifiers for metadata records and vocabularies that are consistent with the FAIR principles. The group identified next steps on persistent identifiers for metadata records (could DataCite DOIs be used?) and next steps for achieving FAIR vocabularies (persistent identifiers for keywords, which is related to encouraging or requiring keywords that are from online vocabularies, and will be a step toward interoperability of vocabularies through use of ontologies.)
Text contributed by Fran Lightsom, SWWG lead! See more at the SWWG meeting notes page.
The group heard an engineer's perspective on risk from Nico Luco, who discussed the Earthquake Hazards Program's “Engineering and Risk” project, which contributes to delivering information for building codes and risk assessments. Next, Nate Wood provided an overview of the "Strategic Hazard Identification and Risk Assessment (SHIRA) on DOI Resources" Project, including an introduction to the DOI Risk Map, related data resources, and a relative threat matrix currently in development. This month’s summary is contributed by Risk CoP co-lead Kris Ludwig!
See more at the Risk Community of Practice Meetings page.
Related publication: Wood, N., Pennaz, A., Ludwig, K., Jones, J., Henry, K, Sherba, J., Ng, P., Marineau, J., and Juskie, J., 2019, Assessing hazards and risks at the Department of the Interior—A workshop report: U.S. Geological Survey Circular 1453, 42 p., https://doi.org/10.3133/cir1453.
The Software Development Cluster welcomed new cluster co-lead Jeremy Newson, and reminded participants that the USGS Software Management Website is up and running at https://www.usgs.gov/products/software/software-management/.
At the June meeting, the cluster reviewed the many related sessions at the CDI workshop, including Software Release Q&A, the Software Development Cluster Breakout Session, a Software Release Practicum, and a Software Birds-of-a-Feather Lunch. Discussions in those sessions included considerations and ideas for cross-USGS collaboration, institutional support, and software developer career paths at the USGS.
Some ideas and take-aways from the discussions include:
Full notes can be found at the workshop Slides, Recordings, and Notes page (if you log in as a CDI member). Cassandra Ladino, Michelle Guy, and Jeremy Newson are the cluster leads.
The May 8, 2019 CDI Monthly Meeting featured two CDI project teams and a presentation about NSF-funded lidar data management capabilities.
Hans Vraga presented on the motivation and technical details of an Ice Jam Hazard website and reporting system. The cloud-first system demonstrated use of the latest cloud technologies in a USGS mobile-friendly application. Hans is part of the Web Informatics and Mapping (WIM) team, which develops web-based tools that support USGS science and other federal science initiatives. You can see some of their other projects here: https://wim.usgs.gov/i/projects/
Jess Walker presented on her experience in developing a workflow for lidar processing and analysis in the cloud for USGS datasets. Working with the USGS Cloud Hosting Solutions team, she searched for solutions for processing and analyzing smaller-size (long-tail) lidar datasets using software like Entwine (https://entwine.io/) and Potree (http://potree.org/).
Chris Crosby from UNAVCO showed how OpenTopography (https://opentopography.org/) facilitates community access to high-resolution, Earth science-oriented, topography data, and related tools and resources. He also described upload and archiving for small to moderate sized topographic datasets in the Community Dataspace.
The OpenTopography Tool Registry provides a community populated clearinghouse of software, utilities, and tools oriented towards high-resolution topography data (e.g. collected with lidar technology) handling, processing, and analysis.