
CDI Conference Call - October 14, 2015

The Community for Data Integration (CDI) meetings are held the 2nd Wednesday of each month from 11:00 a.m. to 12:30 p.m. Eastern Time.

USGS/DOI Dial In Number: (703) 648-4848 (for USGS and DOI offices)
Toll Free Dial In Number: (855) 547-8255 (for other offices and telecommute locations)
Conference Code: 47919# (same for both numbers)

Webex Recording

Webex recordings are available to CDI Members. Please log in to view the recording. If you would like to become a member of CDI, please email

Agenda (in Eastern time)

11:00a Welcome - Kevin Gallagher, Associate Director for Core Science Systems and Tim Quinn, Acting Chief for Office of Enterprise Information

11:10a Public Lab - Mathew Lippincott, Public Laboratory for Open Technology and Science


Presentation: Slides are available to CDI Members. Please log in to download the slides. If you would like to become a member of CDI, please email



11:25a Citizen Sensing and the Problems and Practices of Citizen-Gathered Data - Jennifer Gabrys, Goldsmiths, University of London

Presentation: Slides are available to CDI Members. Please log in to download the slides. If you would like to become a member of CDI, please email


The “Citizen Sense” research project investigates the rise of low-cost sensors used by citizens to monitor environments. Air pollution monitoring is one area in which there has been considerable development of citizen-sensing technologies. Through a set of participatory design and practice-led research methods, Citizen Sense has worked with a community in northeastern Pennsylvania, USA, to test citizen-sensing technologies for monitoring air pollution in relation to unconventional natural-gas production. Yet throughout the course of the project, numerous questions have emerged about what counts as “hard data,” whether or how low-cost devices might provide data of relevance in comparison to reference monitors, what sort of protocols might need to be in place to ensure the accuracy of data, and who is able to make claims about what citizen-gathered data demonstrates. This presentation will address the ways in which citizen-collected and citizen-sensed air quality data might make a greater contribution to environmental understanding and research.

11:40a Water Canary - Sonaar Luthra, Water Canary

Presentation: Slides are available to CDI Members. Please log in to download the slides. If you would like to become a member of CDI, please email

11:55a Question & Answer Session

12:10p CDI IdeaLab Tour

Presentation: Slides are available to CDI Members. Please log in to download the slides. If you would like to become a member of CDI, please email


12:20p Working Group Reports

  • Citizen Science - Dave Govoni
  • Data Management - Heather Henkel and Viv Hutchison
  • Earth-Science Themes - Roland Viger
  • Semantic Web - Fran Lightsom and Janice Gordon
  • Tech Stack - Daniella Birch and Roland Viger
  • Connected Devices - Tim Kern and Lance Everette

12:30p Adjourn

Presentation Q/A Notes

Rex Sanders: I have a question for Sonaar. One of the issues locally, this is not a USGS issue, but one of the issues locally is that it takes forever to get coliform counts from the ocean water so that they say, “Oh, well, the coliform counts a week ago were too high so you shouldn’t swim in the ocean today.” How will that work?

Sonaar: Yes, that is one of the things that we learned after we had begun the project. Most of the time when you get a beach closing, what you are really being told is that a week ago things were really bad and we actually don’t know what it is like today. And that beach can really only reopen after test results come back in that clear the water source as being safe. Which is kind of maddening because it means that, when it comes to environmental illness, more often than not, people are the canaries in the coal mine. Whether it’s the Legionella outbreak in New York or deeper problems that are faced in the developing world...more systematic ones, it’s a big, big issue. We are not immediately focused on coliform, but we have the capabilities to provide the world with a source that can do that in real time. We have found that one of the issues, especially when we met with African government leaders, is that some of this data is politically...there are a number of challenges with releasing it and there is a lot of pushback. One of the things we found is that if you are going to do that, you need to know exactly what you’re doing. You need to be an expert at deploying your technology and know every single aspect of what could go wrong. That really led us to take a few steps back and look at what we can do that would not be life or death to begin with, because there are going to be a lot of liabilities and the list of challenges that we are addressing is only going to expand as we get deeper into developing solutions. What we have developed is really a contamination roadmap. We are starting with nutrients because of the widespread, pervasive problems affecting our oceans. And there is a lot of potential, forgive me if this sounds blunt, “bang for the buck.” We are focusing on that because it is a situation where not just the public but also the private sector has a lot of interest in finding a solution. I mean, farmers don’t like it when fertilizer just washes off their field. That costs them money. We plan to move very quickly from nutrients into things like volatiles, pesticides, heavy metals and eventually microbiologicals. But we want to make sure that data integrity is the thing that drives us every step of the way. We think that focusing on areas where there is existing equipment that will help to validate us is the most responsible way to begin. That is going to give us the credibility and the trust that we need to take on more.

Rex Sanders: Well, that brings up a question for either Jennifer or Mathew. Have you had that kind of political pushback on the sensitivity of the data that you are trying to collect? I know you touched on it a little in your discussions about making sure you can trust the data that you are collecting. What has your experience been when you go into a community and say, “Oh wait, things are really bad here”?

Mathew: We wait for communities to invite us in. It is a very tricky thing to show someone the output of a sensor and then say, “Well, we don’t really know what to do with this.” There is a growing number of citizen science or citizen sensing projects that involve very speculative devices, including our own. We have a lot of speculative devices, and we try to draw a clear line when we are working with things that are in development, and to involve the community in that development without making any promises about what we can do going forward. We also try to start our monitoring projects from a regulatory outcomes perspective. So before we are collecting data, we have a community meeting that is about what those data are going to be used for. If we are going to collect data for water quality, then what are the relevant EPA regulations? Who’s in charge? Is it worth collecting some kind of quantifiable turbidity data, or maybe we should just go out and take pictures and send them to the local health inspector? Maybe that will be more effective. We try to match the data to the actual outcome that the community is seeking before we collect the data, as opposed to collecting a wealth of data and then sitting on it wondering what we can do with it. It can be very disempowering if a community spends a lot of time collecting data and no one thought about who they were going to talk to at a regulatory agency beforehand. Suddenly, they turn around and they have all of this suggestive data that doesn’t meet any of the standards to get them anything that they want.

Jennifer: So from my perspective, it is quite interesting because we began in part by looking at regulations coming out of the EPA, and they suggest that a lot of these technologies are giving rise to a lot of exploratory ways for people to monitor environments that don’t exactly match up to existing regulatory infrastructure. What has been interesting is to look at this across the spectrum and to see that it’s not just a matter of matching up to existing regulation because, in the case of natural gas extraction, we know that regulation is potentially a bit thin on the ground. So there is that issue. So, how might new types of citizen science data challenge or expand regulations? We don’t see it as a one-way issue of just trying to match what the regulation is asking for, but as a matter of looking at the rise of these technologies, the practices and questions that they are giving rise to, and new kinds of data and findings about environmental pollution that might not necessarily have been accounted for. So these are new types of data practices that might in some cases match up with existing regulatory frameworks, but in other cases might very well challenge them. So in response to your question, yes, we have had pushback from state and Federal agencies, but also a lot of questions about what is this data? How can we begin to deal with it, to work with it?