


Goals

The goals of data review are to ensure that:

  1. ALL of the data stored in BioData are as complete and accurate as possible
  2. ALL of the data that should be released to the public are released to the public in a timely manner
  3. ALL of the data that should not be released to the public (internal-use only and proprietary data) will not be released to the public

Data Release Rules

BioData places every sample in one of three categories:

  • Public
    • Anyone can view or retrieve the data
  • Project Staff
    • Only Project Staff, National Data Stewards, and Water Science Center (WSC) Data Stewards can view or retrieve the data
    • Project Staff can see only data for projects to which they are assigned
  • Data Steward
    • Only National and WSC Data Stewards can view or retrieve the data
    • WSC Data Stewards can see ALL data for ALL projects assigned to their science center, but have only Public rights for data of other WSCs
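
The visibility rules above can be sketched as a small permission check. This is only an illustration of the rules as written on this page, not BioData's actual implementation; the names `Role`, `Category`, `User`, `Sample`, and `can_view` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Category(Enum):
    PUBLIC = "Public"
    PROJECT_STAFF = "Project Staff"
    DATA_STEWARD = "Data Steward"


class Role(Enum):
    ANONYMOUS = "Anonymous"
    PROJECT_STAFF = "Project Staff"
    WSC_DATA_STEWARD = "WSC Data Steward"
    NATIONAL_DATA_STEWARD = "National Data Steward"


@dataclass(frozen=True)
class User:
    role: Role
    projects: FrozenSet[str] = frozenset()   # projects the user is assigned to
    science_center: Optional[str] = None     # only meaningful for WSC stewards


@dataclass(frozen=True)
class Sample:
    category: Category
    project: str
    science_center: str


def can_view(user: User, sample: Sample) -> bool:
    """Return True if the user may view or retrieve the sample."""
    if sample.category is Category.PUBLIC:
        return True  # anyone can see public data
    if user.role is Role.NATIONAL_DATA_STEWARD:
        return True
    if user.role is Role.WSC_DATA_STEWARD:
        # Full access within their own science center, Public rights elsewhere.
        return user.science_center == sample.science_center
    if user.role is Role.PROJECT_STAFF:
        # Project Staff see non-public data only for their own projects,
        # and never data in the Data Steward category.
        return (sample.category is Category.PROJECT_STAFF
                and sample.project in user.projects)
    return False
```

For example, a WSC Data Steward for a hypothetical center "WSC-A" can view a Data Steward sample from "WSC-A" but not one from "WSC-B".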

The BioData GUI uses a colored icon for each of these categories (the icon images are not reproduced here):

  • Public (shown in green)
  • Project Staff
  • Data Steward

From a practical, operational perspective:

  • If all of your samples should be released to the public, then all of the dots in the sample list (Physical Samples or Community Samples tab) should be green when you are done reviewing your data.

Sample data is categorized "Public" when all of the following are true...

  • Sample data passes all of the automated BioData data validation tests
  • The Project Profile has been reviewed and accepted
  • The project Study Reach for the sample has been reviewed and accepted
  • The Sample collection information has been reviewed and accepted (on sample data entry screen)
  • The Sample is flagged as "unrestricted" using the Analysis Status Code in the sample header (on sample data entry screen)
  • Taxon ID/count records have been input for community (invertebrate, fish, algae) samples
  • All of the taxon id/count records have been reviewed and accepted (in Review Taxa Records tab)

Sample data is categorized "Project Staff" when...

  • any of the above items have not been reviewed and accepted
  • the sample is flagged as "internal" or "proprietary" using the Analysis Status Code in the sample data entry screen information header.

Sample data is categorized "Data Steward" when...

  • BioData's automated data-checking routines detect invalid or erroneous data
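
The categorization rules above can be sketched as a single function. This is a minimal illustration, not BioData's code: the `SampleReview` fields and function names are hypothetical, and the sketch assumes that a validation failure takes precedence over every other condition, that a sample with no study reach is treated as having an accepted reach, and that the taxon conditions apply only to community samples.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampleReview:
    passes_validation: bool       # automated BioData validation tests
    profile_accepted: bool        # Project Profile reviewed and accepted
    reach_accepted: bool          # Study Reach reviewed and accepted
    collection_accepted: bool     # sample collection info reviewed and accepted
    analysis_status: str          # "U" unrestricted, "I" internal, "P" proprietary
    is_community: bool = False    # invertebrate, fish, or algae sample
    has_taxon_records: bool = False
    taxa_accepted: bool = False   # all taxon records reviewed and accepted


def public_blockers(s: SampleReview) -> List[str]:
    """List the reasons a sample is withheld from the public."""
    blockers = []
    if not s.profile_accepted:
        blockers.append("Project Profile not reviewed and accepted")
    if not s.reach_accepted:
        blockers.append("Study Reach not reviewed and accepted")
    if not s.collection_accepted:
        blockers.append("Sample collection information not reviewed and accepted")
    if s.analysis_status != "U":
        blockers.append("Analysis Status Code is not unrestricted")
    if s.is_community:
        if not s.has_taxon_records:
            blockers.append("No taxon records entered")
        elif not s.taxa_accepted:
            blockers.append("Taxon records not reviewed and accepted")
    return blockers


def release_category(s: SampleReview) -> str:
    """Return "Public", "Project Staff", or "Data Steward"."""
    if not s.passes_validation:
        return "Data Steward"   # assumed to override all other conditions
    return "Project Staff" if public_blockers(s) else "Public"
```

The list returned by `public_blockers` plays the same role as the popup that explains why a sample is being withheld.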

Recommended Steps

Step 1 - Review Your Project Profile

None of the samples for your project will be available to anyone but the project staff until the Project Profile has been reviewed and accepted. So get this out of the way before you worry about reviewing individual samples.

  • fix any project profile validation errors flagged by the system
  • enter any missing data
  • correct any typos
  • review the project abstract - this is published as a description of the project. Make sure it's good.
  • When you are done set the Review Status Code to "Reviewed and Accepted".

To open the project profile click on the "Edit" link next to the Project drop-down box at the top of the page. Note that the "Edit" link is only displayed to Project Staff.

Step 2 - Review Your Study Reaches

If you've assigned your samples to study reaches, sample data will be released only if the study reach location information has been reviewed and accepted.

Select the Study Reaches tab

  • fix any validation errors flagged by the system
  • enter any missing data
  • correct any typos
  • set Review Status Code to "Reviewed and Accepted"

If the project profile and the study reach are both good to go, there should be a green icon next to the study reach in the study reach list. If there is not, click on the icon to see what the problem is.

Step 3 - Review Your Physical and Community Samples

For each physical or community sample you need to:

  • fix any validation errors flagged by the system
  • enter any missing data
  • correct any typos
  • set Review Status Code to "Reviewed and Accepted"
  • Set Analysis Status Code to "U - unrestricted" for samples that will be released to public. Samples set to "I - Internal" and "P - Proprietary" will be released only to project staff.

You can find the review status of your samples in any of three places:

  1. In the sample lists under the Physical or Community Samples tabs
  2. In the Data Release Status tab
    1. NOTE: option 1 may be your only choice for larger projects as this feature may be very slow or not work at all with large projects
  3. By retrieving the sample inventory from the BioData Retrieval website.
    1. NOTE: you need to log in to BioData and then go to the internal retrieval page to get non-public samples listed in the sample inventory

Using the Physical or Community Sample List to Determine Review Status

Select the Physical Samples or Community Samples tab. This displays a list of samples.

  • The Review Status column indicates whether the sample has been reviewed and accepted.
  • The Release Category column indicates whether the sample is being released to the public, to project staff, or only to data stewards. Click on the icon to display a popup window showing which factors are causing a sample to be withheld from the public.

Using the Sample Data Entry Screen to Determine Review Status

The sample data entry screens include fields for setting whether the sample can be released to the public (Analysis Status Code), and whether the sample information has been reviewed and accepted (Review Status Code).

The sample data entry screen also indicates the Release Category of the sample (Public, Project Staff, Data Steward). Clicking on the icon displays a popup that shows why a sample is not being released to the public. From the popup, you can click the "Fix It" button to jump to the screen where the problem is.

 

Using the Data Release Status Tab to Determine Review Status

Select the Data Release Status tab to determine which samples need work.

  • NOTE: this feature may be very slow or not work at all on larger projects

The summary section displays pie charts that summarize the overall release status of project samples. Clicking on pie titles or slices will filter the sample list below.

The sample list in this tab indicates which samples will be released to the public, and what factors are preventing any particular sample from being released. It can be filtered to display a subset of all of the samples entered.

Step 4 - Review Your Taxonomic Data Records

Look for these issues in taxonomic data

1) For invertebrates and diatoms, confirm that minimum target counts were achieved (e.g. 300 for invertebrates or 600 for diatoms). Tools in BioData help identify minimum target counts. If targets were not met, confirm that the lab notes describe the condition of the sample. For example, the entire invertebrate sample might have been sorted without reaching 300 organisms, or the 8-hour time limit specified in OFR 00-212 might have been reached. In another example, diatoms on the slide might have been obscured by fine silts and clays. USGS biologists can use this information to confirm that sampling was successful and, if it was not, to decide how to correct future field collections.

2) For invertebrates, confirm that organisms were identified to target levels (e.g. OFR 00-212). Tools in BioData help identify minimum target counts. If targets were not met, confirm that lab notes document the shortfall. For example, a significant number of immature organisms will prevent identification to target level. This information can be used to consider shifting the sampling period to obtain mature individuals in future field collections.

3) Determine if there are obvious differences between current and previous taxonomic data from the site. For example, changes in the taxonomic level to which organisms are identified, appearance of new taxa, absence of formerly common taxa, or significant changes in relative abundance of taxonomic groups should alert the analyst to look further at the data, ask the lab about analyses, or contact one of the BioData taxonomic stewards for assistance.

4) Consider comparing quantitative (e.g. IRTH) and qualitative (e.g. IQMH) results, if applicable. Note obvious differences and determine if there are differences in collection procedures or personnel. USGS biologists can use this information to flag data, ask questions, alter procedures, or contact one of the BioData taxonomic stewards for assistance.
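
Checks 1 and 3 above lend themselves to a quick script once taxon counts have been exported. The sketch below is only an illustration: the function names, the input shape (a mapping of taxon name to count), the 0.25 shift threshold, and the sample data are all hypothetical, and the target counts are the example values quoted in item 1.

```python
# Example minimum target counts quoted in item 1 above.
MIN_TARGET_COUNTS = {"invertebrate": 300, "diatom": 600}


def met_target_count(community, counts):
    """True if the total organism count reaches the minimum target."""
    return sum(counts.values()) >= MIN_TARGET_COUNTS[community]


def compare_taxa(previous, current, shift_threshold=0.25):
    """Compare two {taxon: count} mappings from the same site.

    Returns (new_taxa, missing_taxa, shifted_taxa), where shifted_taxa are
    shared taxa whose relative abundance changed by more than
    shift_threshold (a hypothetical cutoff, expressed as a fraction).
    """
    new_taxa = sorted(set(current) - set(previous))
    missing_taxa = sorted(set(previous) - set(current))

    prev_total = sum(previous.values())
    cur_total = sum(current.values())
    shifted_taxa = []
    if prev_total > 0 and cur_total > 0:
        for taxon in sorted(set(previous) & set(current)):
            change = current[taxon] / cur_total - previous[taxon] / prev_total
            if abs(change) > shift_threshold:
                shifted_taxa.append(taxon)
    return new_taxa, missing_taxa, shifted_taxa


# Made-up counts for illustration only.
previous = {"Baetis": 150, "Chironomidae": 100, "Hydropsyche": 50}
current = {"Baetis": 20, "Chironomidae": 260, "Simulium": 30}
new_taxa, missing_taxa, shifted_taxa = compare_taxa(previous, current)
```

Any taxon that appears in one of the three lists is a prompt to look further at the data or to ask the lab, not a verdict that the data are wrong.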

Process for checking taxonomic data

1) Select the sample that you want to review

2) Set the review code for the taxonomic records

 

STEP 1: Select the sample that you want to review

  1. Apply filters to the display to show the samples that you want to select from.
    1. Only one community will be displayed at a time
  2. Consult the Taxonomic Data Review Icons page to understand what the icons mean
  3. Click on the Display link to open that set of records

 

STEP 2: Set the review code for the taxonomic records

  • You can use the column sort and resizing features to examine your data and look for outliers
  • The left hand summary panel can be hidden
  • You need to set the review code for every record
  • Beware: if your data set is large, it may be spread across several pages (note "Page Size" and "Page x of y" below)
    • the 'select all' check box may not select all the records
    • you might want to consider using the "Set ALL Results to:" option at the bottom of the page.
  • Return to the sample/lab order list and make sure the Taxonomic Data Review Icon is filled in appropriately
    • If it's not, you missed some records or made some errors.

Review Status definitions

    • Awaiting Review: Default state given every result record when it is created.
    • Reviewed and Accepted: Reviewer has reviewed and accepted the result.
    • Reviewed and Rejected: Reviewer has rejected the record. One rejection in a result set means that none of the results in the result set will be published to BioData Retrieval.
    • Presumed Satisfactory: Result is neither accepted nor rejected, but will not inhibit the transfer of the result set to BioData Retrieval.
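
The effect of these statuses on a result set can be sketched as follows. The function name is hypothetical. The definitions above state that a single rejection blocks the whole result set and that Presumed Satisfactory does not; treating Awaiting Review as blocking, and an empty result set as not publishable, are assumptions of this sketch.

```python
# Statuses that do not stop a result set from moving to BioData Retrieval.
NON_BLOCKING = {"Reviewed and Accepted", "Presumed Satisfactory"}


def result_set_publishable(statuses):
    """True if every record in the result set has a non-blocking status.

    statuses is a list with one review status string per result record.
    An empty list returns False (assumption: no records, nothing to publish).
    """
    return bool(statuses) and all(s in NON_BLOCKING for s in statuses)
```

For example, a result set with 99 accepted records and one "Reviewed and Rejected" record is not publishable.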

Fish Condition Factor - sort to identify typos
The Fish Condition Factor is provided to identify typos only, and is not intended for weight-length relationship analysis. Sort by the Fish Condition Factor to identify typos in fish total length and weight data entry. Pay special attention to the highest values (especially those over 10) and lowest values (those less than 0.6).

BioData uses the following calculation for condition factor:

Source: Williams, J.E., 2000, Chapter 13, The Coefficient of Condition of Fish, in Schneider, James C., ed., Manual of fisheries survey methods II: with periodic updates: Michigan Department of Natural Resources, Fisheries Special Report 25, Ann Arbor.
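
The formula itself is not reproduced in this text. As a rough illustration of the typo check, the sketch below assumes the common metric form of Fulton's condition factor, K = 100,000 × W / L³, with weight W in grams and total length L in millimeters. Whether BioData uses exactly this form and these units is an assumption. The function names and sample records are hypothetical; the thresholds of 10 and 0.6 come from the guidance above.

```python
def condition_factor(weight_g, total_length_mm):
    """Fulton-type condition factor (assumed metric form, grams and mm)."""
    return 100_000 * weight_g / total_length_mm ** 3


def suspect_records(fish, high=10.0, low=0.6):
    """Return IDs of records whose condition factor suggests a typo.

    fish is a list of (record_id, weight_g, total_length_mm) tuples.
    Results are sorted from highest to lowest condition factor.
    """
    scored = [(condition_factor(w, length), rec_id) for rec_id, w, length in fish]
    scored.sort(reverse=True)
    return [rec_id for k, rec_id in scored if k > high or k < low]


# Made-up records: B has a length missing a digit, C a weight missing a digit.
fish = [("A", 100, 200), ("B", 100, 20), ("C", 10, 200)]
flagged = suspect_records(fish)
```

A flagged record is only a candidate for a data entry error; check the original field sheet before changing a length or weight.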

 

Tips and Tricks

Use the Data Retrievals
