
Scientific Computing Topics

Scientific Computing Environments

Statistical environments

S-Plus

USGS holds a license for a version of the S-Plus statistical package, and the USGS internal distribution includes USGS-developed statistical and graphics tools.

R

R is an open-source statistical analysis system built to be functionally equivalent to S. It is gaining in popularity, and a number of GUIs (such as RCommander) and interfaces are available for it. Since R has a command-line interface, it is fairly easy to connect with other software, for example ArcGIS and Java Python ("Jython"). Some have described R as a "statistical scripting language."
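Because R can be driven from the command line, another program can call it as a subprocess. The following is a minimal Python sketch of that idea, assuming the Rscript executable that ships with R is on the PATH. The helper names build_rscript_command and run_r are hypothetical and chosen only for illustration.

```python
import shutil
import subprocess


def build_rscript_command(expression):
    """Return the argument list that asks Rscript to evaluate one R expression."""
    return ["Rscript", "-e", expression]


def run_r(expression):
    """Evaluate an R expression and return whatever R printed, as text.

    Returns None when Rscript cannot be found on the PATH (an assumption of
    this sketch is that an installed R exposes Rscript there).
    """
    if shutil.which("Rscript") is None:
        return None
    completed = subprocess.run(
        build_rscript_command(expression),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    # Ask R for the mean of a small vector; prints None if R is unavailable.
    print(run_r("cat(mean(c(1, 2, 3, 4)))"))
```

Passing the command as a list rather than a single string avoids shell quoting problems when the R expression contains spaces or quotes.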

Although the user community is very good at answering questions, the volume of questions and answers may be overwhelming.

Python

Python is an interpreted, interactive, object-oriented, extensible programming language supported on numerous computing platforms. In addition to being relatively easy to use and good for numerical analysis and web programming, it is the de facto scripting language for our corporate GIS platform, ArcGIS.

  • The starting point for all things Python is: http://python.org
  • A large number of instructional videos are available; follow this link for an example.
  • Python is widely used outside the USGS, so there are lots of great resources hosted in the wide world. We should really participate there, instead of building our own microverse. See the Python Forum.

Packages for Python

Generally useful stuff

  • NumPy is a really important add-on for Python. It provides low-level functions for handling arrays. It comes with ArcGIS.
  • SciPy is another really important add-on for Python. It's built on top of NumPy and provides higher-level (i.e., more user-friendly) functionality. It also comes with ArcGIS.
  • Python is distributed with a large standard library of modules that support various tasks, but many more are available online. An extensive set of pre-compiled libraries is available in this collection posted by Christoph Gohlke. Key libraries of interest to scientific computing include NumPy, SciPy, matplotlib, and netCDF4.
  • Python bindings for the GDAL and OGR libraries are now available from the Python Package Index (PyPI), in a package called GDAL.
  • Using Python with Fortran or C sub-page
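Before relying on any of these add-ons, it can help to check which ones a given Python installation actually provides. The following is a minimal sketch using only the standard library. The function name find_available is hypothetical, and the list assumes the usual import names for the libraries mentioned above (osgeo is the import name of the GDAL/OGR bindings).

```python
import importlib.util

# Import names for the libraries mentioned above. "osgeo" is the import name
# used by the GDAL/OGR Python bindings.
KEY_LIBRARIES = ["numpy", "scipy", "matplotlib", "netCDF4", "osgeo"]


def find_available(module_names):
    """Map each module name to True if it can be located, else False."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


if __name__ == "__main__":
    for name, available in find_available(KEY_LIBRARIES).items():
        status = "installed" if available else "missing"
        print(f"{name:12s} {status}")
```

The check only locates each top-level module and does not import it, so it stays fast even when large libraries are present.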

In addition to a truly dizzying number of individual add-on libraries for Python, there are a few distributions that bundle sets of Python libraries to eliminate the hassle of pulling them together one by one. We should investigate these!

More obscure/advanced stuff

  • Integrated Development Environments (IDEs) - There are lots of choices, many of which allow a user to write in more than one language. See the sub-page on this.
  • NetworkX - a pretty hard-core library for making complex networks and graphs. Could also be useful for TINs.
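To illustrate why a graph library could be useful for TINs: every triangle contributes three edges, so a TIN can be treated as a graph whose nodes are the triangle vertices. The sketch below shows that idea with plain dictionaries from the standard library rather than NetworkX itself. The vertex IDs and the function name tin_to_adjacency are hypothetical.

```python
from collections import defaultdict


def tin_to_adjacency(triangles):
    """Build an undirected adjacency map from TIN triangles.

    Each triangle is a tuple of three vertex IDs; every pair of vertices
    in a triangle shares an edge.
    """
    adjacency = defaultdict(set)
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            adjacency[u].add(v)
            adjacency[v].add(u)
    return dict(adjacency)


if __name__ == "__main__":
    # Two triangles sharing the edge between vertices 1 and 2.
    triangles = [(0, 1, 2), (1, 3, 2)]
    for vertex, neighbors in sorted(tin_to_adjacency(triangles).items()):
        print(vertex, sorted(neighbors))
```

With NetworkX installed, the same vertex pairs could instead be handed to a networkx.Graph through add_edges_from, which gives access to the library's graph algorithms.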

Python and ArcGIS

  • Python, aside from being a standalone general scripting language, has become the main scripting language for the ArcGIS platform. Versions of Python 2.x and Python libraries that are included in different versions of ArcGIS are as follows:

    ArcGIS       9.3      10.0     10.1 (beta)
    Python       2.5.1    2.6.2    2.7
    NumPy        1.0.3    1.3      1.5
    matplotlib                     1.0.1

  • The netCDF4 module compiled for ArcGIS 10.0 and 10.1 allows fairly straightforward access to netCDF and OPeNDAP data from ArcGIS Python scripts. Thanks to Rich Signell and, most of all, Christoph Gohlke (who compiled the module so it will work with Arc). Rich and Curtis Price provided this Python script and script tools (zipfile).
    • The image below shows an example of a raster that has been loaded into ArcMap from a remote dataset using a Python script tool that accesses data using the netCDF4 library and the OPeNDAP access protocol.
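Since each ArcGIS release bundles its own Python, a script can check which release its interpreter corresponds to. The following is a minimal sketch that encodes the Python versions listed above for each ArcGIS release and compares them with the running interpreter at the major.minor level. The names ARCGIS_PYTHON and matching_arcgis_releases are hypothetical.

```python
import sys

# Python versions bundled with each ArcGIS release, as listed above.
ARCGIS_PYTHON = {
    "9.3": (2, 5, 1),
    "10.0": (2, 6, 2),
    "10.1 (beta)": (2, 7),
}


def matching_arcgis_releases(version_info=None):
    """Return ArcGIS releases whose bundled Python matches major.minor.

    Defaults to the interpreter that is running this script.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return [
        release
        for release, bundled in ARCGIS_PYTHON.items()
        if bundled[:2] == major_minor
    ]


if __name__ == "__main__":
    print("Running Python %d.%d" % tuple(sys.version_info[:2]))
    print("Matching ArcGIS releases:", matching_arcgis_releases())
```

Comparing only the major and minor numbers is a deliberate simplification, because compiled add-on modules generally need to match the Python minor version rather than the exact patch level.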

Discussion topics

MATLAB

MATLAB is commonly used for data- and compute-intensive scientific analysis.

Known USGS MATLAB users: Rich Signell, Ashley Van Beusekom

Microsoft Office

Although Microsoft Office is very useful for general-purpose computing and is widely used in science, it has also been widely criticized by the scientific community (especially by statisticians). The largest problems by far are data import/export and misuse of the tools, for example the (far too common) use of Excel as a database, and errors in worksheet cell references.
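One way to catch export problems early is to validate a CSV file exported from a spreadsheet before analyzing it. The following is a minimal sketch using only the Python standard library. It flags cells in numeric columns that cannot be parsed as numbers, which includes blanks and Excel error values such as #REF! left behind by a broken cell reference. The function name find_bad_cells and the sample column names are hypothetical.

```python
import csv
import io


def find_bad_cells(csv_text, numeric_columns):
    """Return (row_number, column, value) for cells that are not numbers.

    Row numbers count data rows starting at 1 (the header is not counted).
    Blank cells and Excel error strings such as "#REF!" are both reported,
    because neither can be converted to a float.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column in numeric_columns:
            value = (row.get(column) or "").strip()
            try:
                float(value)
            except ValueError:
                problems.append((row_number, column, value))
    return problems


if __name__ == "__main__":
    # Hypothetical export: one good row, one broken reference, one blank.
    sample = "site,discharge\nA,12.5\nB,#REF!\nC,\n"
    for problem in find_bad_cells(sample, ["discharge"]):
        print(problem)
```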

USGS holds a site license for MS Office, through the Bureau Windows Technical Support Team (BWTST).

Geographic Information Systems (GIS)

Since much of USGS scientific computing involves spatial data, it is no surprise that more than half of the attendees polled at the 2011 CDI meeting identified themselves as users of Esri's ArcGIS product.

USGS Core Science Systems supports the Enterprise GIS (EGIS) team, which supports GIS activities in the Bureau. EGIS supports USGS-wide site licenses for Esri's ArcGIS suite and Global Mapper.



Contributors

User                  Edits   Comments   Labels
Viger, Roland         49      0          5
Signell, Richard P.   6       0          0
Benson, Abigail L.    1       0          0
Blodgett, David L.    1       0          0