Hans Vraga from the Web Informatics and Mapping Program (WIM, wim.usgs.gov) gave an overview of the group, of which he is the Project Manager. WIM is a web development shop that has cooperators from both within and outside of the USGS. Some of their products include a SPARROW model output visualizer, StreamStats, and a WHISPers wildlife event reporting system (coming soon).
As you can imagine, their expertise is in high demand. They look for cooperators whose scientific or subject matter expertise complements the group’s technical expertise and who will act as an active product owner. They also look for projects that focus on development, minimize time spent on operations, and have a fast turnaround. Check out their website or contact Hans Vraga for more information.
In February, two major questions came up for discussion in the group. They were passed along to the appropriate committees and officials for guidance, and answers came back quickly!
First: Is there updated guidance on the volume of data necessary to trigger a separate data release (as opposed to a table in a publication)? Short answer: Having the data in the paper is ok. However, if the data are large enough to be moved into a supplemental section of the paper, they must be published as a USGS data release.
Second: How should authors reference data that is not publicly available when writing a manuscript? Short answer: there is updated guidance on the FSP “Guide to Data Releases” page for data that are not available at the time of publication, or that have limited availability owing to restrictions, in the section Data Associated with a Publication.
John Stock of the USGS Innovation Center joined to talk about some opportunities available for postdoctoral research, future workshops, and future discussions related to AI/ML in the USGS. The joint USGS-NASA postdoctoral fellowships are now posted: https://geography.wr.usgs.gov/InnovationCenter/fellowship.html
Pete Doucette presented a talk, “Ruminations on AI and Land Imaging.” He included a great intro on the difference between the AI and machine learning of decades ago and the capabilities available now (e.g. neural networks versus DEEP neural networks). Several land imaging projects and datasets at the USGS are becoming more “analysis-ready” for use in data science and predictive analytics and to inform decisions. For example, see “Continuous change detection and classification of land cover using all available Landsat data” (Zhu and Woodcock, 2014).
A major theme was the need for the combination of disciplinary expertise and AI/ML expertise, essentially team science, in order to reach the full potential of AI/ML. (See the NAS report Enhancing the Effectiveness of Team Science.)
A White House Fact Sheet on “Accelerating America’s Leadership in Artificial Intelligence” was shared with the group by Mona Khalil and Leah Colasuonno.
A few slides from Pete Doucette's talk on AI and Land Imaging.
Cassandra Ladino stepped in to lead the February Semantic Web Working Group discussion, which focused on the theme of FAIR (Findable, Accessible, Interoperable, Reusable) in USGS. The group discussed ideas for a proposed FAIR Workshop, including the topic of new approaches and technologies to further enhance FAIRness at USGS. See the meeting notes for more resources and references.
The joint ESIP Tech Dive - CDI Tech Stack presentation was on “Cloud Native Geoprocessing of Earth Observation Satellite Data with Pangeo,” by Scott Henderson, University of Washington. “The integration of new technologies with several high-level Python packages are enabling Cloud-native workflows and circumvent the bottleneck of downloading large amounts of data.”
Aptly summarized: “If that doesn’t get people excited I don’t know what will,” said Rich Signell, co-chair of the Tech Stack Group.
Screenshot from a demo linked to the post "Cloud Native Geoprocessing of Earth Observation Satellite Data with Pangeo."
The latest of the monthly eDNA webinars organized by Scott Cornman was on CALeDNA (California Environmental DNA), presented by Rachel Meyer of UCLA. CALeDNA capitalizes on the enthusiasm of citizen scientists by providing kits for collecting data in the field. Data collectors also take iNaturalist observations for benchmarking. The data are provided online for the public to identify patterns, and are also used for academic research on topics like phylogenetic diversity and functional diversity.
CALeDNA used the Kobo toolbox to build their data collection form; they found it to be the most robust platform for cell phone data collection. https://www.kobotoolbox.org/
rANACAPA is an R package developed so that non-specialists without a community ecology background can generate the relevant plots. See “Ranacapa: An R package and Shiny web app to explore environmental DNA data with exploratory statistics and interactive visualizations” https://f1000research.com/articles/7-1734/v1
Check out one of their case studies and the data visualizations available! https://data.ucedna.com/research_projects/pillar-point
A few slides from Rachel Meyer's talk on the California eDNA program.
The Software Development Cluster hosted a discussion on Cloud and Big Data in the Cloud. Cassandra Ladino started off the discussion with a presentation on Cloud and Big Data, including a summary of resources she has been using to learn more. There is information in the notes on how to sign up for a USGS Cloud Hosting Solutions Sandbox.