
Meeting summaries and links in reverse chronological order.

2019 November 12 - AI/ML in USGS enabled by Tallgrass: classifying golden eagle behavior using telemetry


Log in to access this month's recording and slides.


Abstract: This presentation provides an overview of how we use a recurrent autoencoder neural network to encode sequential California golden eagle telemetry data. The encoding is followed by an unsupervised clustering technique, Deep Embedded Clustering (DEC), to iteratively cluster the data into a chosen number of behavior classes. We apply the method to simulated movement data sets and telemetry data for a golden eagle. DEC achieves better unsupervised clustering accuracy scores for the simulated data sets than the baseline K-means clustering result.
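The iterative clustering step described above can be illustrated with a minimal sketch of DEC's soft-assignment computation (the Student's t-kernel and sharpened target distribution from the standard DEC formulation); the embedded points and cluster centers below are invented toy values, not anything from the presenters' actual model:

```python
# Sketch of DEC's soft cluster assignment (illustrative toy values only).
# q[i][j] is the probability that embedded point z_i belongs to cluster j,
# computed with a Student's t-kernel and normalized per point.

def soft_assignments(points, centers, alpha=1.0):
    """Return the DEC soft-assignment matrix q for embedded points."""
    q = []
    for z in points:
        kernels = []
        for mu in centers:
            dist2 = sum((zi - mi) ** 2 for zi, mi in zip(z, mu))
            kernels.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(kernels)
        q.append([k / total for k in kernels])
    return q

def target_distribution(q):
    """DEC's sharpened target p, used to iteratively refine the clusters."""
    f = [sum(col) for col in zip(*q)]  # soft cluster frequencies
    p = []
    for row in q:
        weights = [qij ** 2 / fj for qij, fj in zip(row, f)]
        s = sum(weights)
        p.append([w / s for w in weights])
    return p

# Toy "embedded telemetry" points and two cluster centers (placeholders).
points = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centers = [[0.1, 0.05], [3.0, 3.05]]
q = soft_assignments(points, centers)
```

In the full method, training alternates between computing q and p and updating the encoder weights to pull q toward p, which is what sharpens the behavior classes over iterations.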

Speaker Bio: Natalya Rapstine is a Computer Scientist in the Advanced Research Computing group, specializing in computational data science, statistics, and machine learning applications for the advancement of science at the U.S. Geological Survey. She received an M.S. in Statistics from the Colorado School of Mines.

2019 October 8 - Radiant MLHub: A Repository for Machine Learning Ready Geospatial Training Data

Abstract: Satellite observations provide invaluable data across different spatio-temporal scales. These data enable us to build models for applications such as land cover classification, agricultural monitoring, surface water mapping, and biodiversity monitoring, among others. Meanwhile, machine learning (ML) techniques can be used to advance these applications and develop faster, more efficient, and more scalable models. These techniques learn from training datasets generated from image annotation or ground reference observations. However, to develop accurate ML-based models and validate their accuracy, we need benchmark training datasets that are representative of the diversity of the target variable and openly accessible to all researchers and developers.
To address this requirement, Radiant Earth Foundation has established Radiant MLHub to foster sharing of geospatial training data for different thematic applications. Radiant MLHub is hosted on the cloud, and users will be able to search for different training datasets and quickly ingest them into their pipelines using an API. To increase interoperability of training datasets generated by different institutions, Radiant MLHub has adopted the SpatioTemporal Asset Catalog (STAC) as the standard for data cataloging.
In this presentation, I will review the architecture of Radiant MLHub, its API access, and the STAC definition for training data. Next, I will present two applications of ML models: land cover classification from multi-spectral data and surface water detection from Synthetic Aperture Radar (SAR) data.
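Since Radiant MLHub catalogs training data with STAC, it may help to see what the core STAC unit looks like. The sketch below builds a minimal STAC Item, which is a GeoJSON Feature with a few required extra fields; all of the id, geometry, and asset values are invented placeholders, not real Radiant MLHub records:

```python
import json

# A minimal STAC Item sketch (all values are invented placeholders).
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-training-chip-001",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[-97.0, 38.0], [-96.9, 38.0],
                         [-96.9, 38.1], [-97.0, 38.1], [-97.0, 38.0]]],
    },
    "bbox": [-97.0, 38.0, -96.9, 38.1],
    "properties": {"datetime": "2019-06-01T00:00:00Z"},
    "links": [],
    "assets": {
        "labels": {
            # Hypothetical asset URL for a label file paired with the imagery.
            "href": "https://example.com/labels/chip-001.geojson",
            "type": "application/geo+json",
        }
    },
}

# Items must carry these top-level fields to be valid STAC.
required = {"type", "stac_version", "id", "geometry", "properties", "links", "assets"}
assert required <= item.keys()
encoded = json.dumps(item)
```

Because every Item is plain GeoJSON with a known schema, datasets cataloged this way by different institutions can be searched and ingested with the same tooling, which is the interoperability point made above.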

Speaker bio: Hamed Alemohammad is the Chief Data Scientist at Radiant Earth Foundation, leading the development of Radiant MLHub as an open-source, cloud-native commons for machine learning applications using Earth observations. He has extensive expertise in machine learning, remote sensing, and imagery techniques, particularly in developing new algorithms for multi-spectral satellite- and airborne-based observations. He also serves as an elected member of the American Geophysical Union’s technical committee on remote sensing. Prior to joining Radiant Earth, he was a Research Scientist at Columbia University. Hamed received his PhD in Civil and Environmental Engineering from MIT in 2014.

2019 September 10 - Utilizing Deep Neural Networks for Landscape Conservation: An Application of Google’s Tensorflow for a Cannabis Production Inventory in Northern California

Abstract: Landscape classification is the task of using imagery to map defined features on the landscape. As computer technology and data science methodology advance, new techniques for this problem emerge. Modern machine learning (ML) utilizing neural networks (NNs) is becoming an industry-standard data science approach for a variety of applications. In particular, NN-based methods are adept and well understood at the task of computer vision (CV).

However, current landscape classification necessarily exposes trade-offs between accuracy, spatial granularity, and resources required. CV offers a unique combination of speed and accuracy, while still producing feature mappings rather than simple pixel classifications. Compared to other explicit feature extractors, such as object-based image analysis (OBIA), CV can be a cost-effective and powerful methodology for obtaining features from difficult-to-classify image domains.

This application shows how an off-the-shelf Deep Neural Network (DNN) algorithm, Inception v2, was retrained into a production classifier and applied to the problem of locating and sizing cannabis production on private lands in Trinity County. This application demonstrates the strengths and limitations of applying this method at the landscape scale.
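Applying an image classifier such as the retrained Inception v2 at landscape scale typically means tiling large scenes into fixed-size chips that the network can ingest. The sketch below shows only that tiling step; the window size and stride are arbitrary illustrative choices, not parameters from the presentation:

```python
def tile_windows(width, height, win, stride):
    """Return (x, y) upper-left corners of win x win chips covering an image.

    Assumes width >= win and height >= win. The right and bottom edges get an
    extra shifted-inward window so the whole scene is covered, a common
    convention for sliding-window classification.
    """
    xs = list(range(0, width - win + 1, stride))
    ys = list(range(0, height - win + 1, stride))
    if xs[-1] != width - win:
        xs.append(width - win)   # cover the right edge
    if ys[-1] != height - win:
        ys.append(height - win)  # cover the bottom edge
    return [(x, y) for y in ys for x in xs]

# Example: a hypothetical 1000 x 600 pixel scene tiled into 299 x 299 chips
# (a typical Inception input size) with roughly 50% overlap between chips.
chips = tile_windows(1000, 600, win=299, stride=150)
```

Each chip would then be run through the classifier, and the per-chip predictions mosaicked back into a landscape-scale feature map; overlap between chips is one common way to reduce edge artifacts.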

The presentation concludes with ‘next steps’ and identifies developing technologies and architectures that mitigate some of the limitations in the current application.

Speaker bio: Daryl Van Dyke serves as the spatial analyst for the USFWS, in Science Applications and Strategic Habitat Conservation. I have an interdisciplinary background, with a focus on community and environment, as well as a second BS and an MS in Environmental Engineering. My thesis focused on using two-dimensional hydrodynamics for fish-passage culvert retrofit design. As a federal servant and a programmer, I've looked at the developing technologies of LiDAR, Structure-from-Motion, and ML as pivotal to the task of landscape analysis and conservation design. Non-technical interests in federal service include integrating analytic workflows, encouraging cross-program collaboration, and building accountability and reproducible science in resource management.

2019 August 13 - Continuous streamflow and nearshore wave monitoring from time-lapse cameras using deep neural networks

Abstract: The expense and logistics of monitoring streamflow (e.g., stage and discharge) and nearshore waves (e.g., height and period) using in situ instrumentation such as current meters, bubblers, and pressure transducers limit the extent to which such important basic information can be acquired. Machine learning might offer a solution, if such information can be obtained remotely from time-lapse imagery using inexpensive consumer-grade camera installations. To that end, I describe a proof-of-concept study into designing and implementing a single deep learning framework that can be used for both stream gaging and wave gauging from appropriate time series of imagery. I show that it is possible to train the framework to estimate 1) stage and/or discharge from oblique imagery of streams at USGS gaging stations, using existing time-lapse camera infrastructure; and 2) nearshore wave height and period from oblique and rectified imagery from USGS Argus systems. This proof-of-concept technique is based on deep convolutional neural networks (CNNs), which are deep learning models for regression tasks based on automated image feature extraction. The stream/wave gauge model framework consists of an existing generic CNN to extract features from imagery, called a ‘base model’, with additional layers to distill the feature information into lower-dimensional spaces and prevent overfitting, and a final layer of dense neurons to predict continuously varying quantities. Given enough training data, the model can generalize well to a site despite variation in, for example, lighting, weather, snow cover, vegetation, and any transient objects in the scene. This development might offer the potential to train models for imagery at sites based on short deployments of in situ instrumentation, especially useful for sites where instrumentation is difficult or expensive to maintain for long periods.
This entirely data-driven technique, at least for now, must be trained separately for each site and quantity, so it would be suitable for very long-term, site-specific estimation of wave or hydraulic parameters from stationary camera installations, subsequent to a training period. Further development might promote low-cost (or even hobbyist) hydrodynamic and hydraulic monitoring anywhere.
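The regression head described in the abstract, dense layers that distill CNN features down to a single continuous output such as stage or wave height, can be sketched in plain Python. The feature vector and weights below are arbitrary placeholders, and a real implementation would sit on top of a pretrained CNN base model in a deep learning framework:

```python
def dense(x, weights, biases, activation=None):
    """One fully connected layer: y = activation(W x + b), row by row."""
    y = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, biases)]
    if activation == "relu":
        y = [max(0.0, v) for v in y]
    return y

def regression_head(features, w1, b1, w2, b2):
    """Distill CNN-extracted features into a lower dimension, then predict
    one continuously varying scalar (e.g., stage, discharge, wave height)."""
    hidden = dense(features, w1, b1, activation="relu")  # dimensionality reduction
    return dense(hidden, w2, b2)[0]                      # single linear output neuron

# Toy sizes: 4 extracted image features -> 2 hidden units -> 1 prediction.
features = [0.5, -1.0, 0.25, 2.0]
w1 = [[0.1, 0.2, 0.0, 0.3], [-0.2, 0.1, 0.4, 0.0]]
b1 = [0.0, 0.1]
w2 = [[1.0, -0.5]]
b2 = [0.05]
prediction = regression_head(features, w1, b1, w2, b2)
```

In the framework described above, the weights of a head like this are what get trained per site and per quantity, while the generic base model supplies the image features; dropout between the layers (omitted here) is one standard way to prevent the overfitting the abstract mentions.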

2019 June 11 - Strategic Science Planning at USGS

  • Pete Doucette provided a review of recent Strategic Science Planning at USGS. This included thoughts captured from the 21st Century Science Update Workshop at NCTC and the CDI Workshop in Boulder, CO, held in May and June 2019.
  • Recording for 2019-06-11

2019 May 14 - XGBoost in Continuous Change Detection and Classification (CCDC); Deep learning to quantify benthic habitat

  • Announcements (Pete Doucette)
    • Pete and other members of an AI/ML focus group presented to the USGS Executive Leadership Team at the end of March. Associate Directors are enthusiastic about incorporating AI/ML into their mission area research.
    • Don't forget: the CDI workshop is happening June 4-7, 2019, in Boulder, Colorado. Tomorrow, May 15, is the last day for registration. Virtual participation links will be posted on the Workshop Page by the week before the workshop.
    • The AI/ML group leads are still interested in collecting an inventory of your AI/ML projects to share with interested USGS leadership. You can fill out the form here.
  • XGBoost in Continuous Change Detection and Classification (CCDC) - Chris Barber, USGS EROS
  • Deep learning to quantify benthic habitat - Peter Esselman, USGS Great Lakes Science Center
  • Recording for 2019-05-14

2019 April 9 - no meeting

2019 March 12 - Infrastructure for Deep Learning at the USGS; AI for Ecosystem Services


2019 February 12 - Innovation Center Opportunities and AI and Land Imaging


2018 December 11 - Inaugural Meeting