# usgs-data-download > Download water level data from USGS using the dataretrieval package. Use when accessing real-time or historical streamflow data, downloading gage height or discharge measurements, or working with USGS station IDs. - Author: JierunChen - Repository: benchflow-ai/skillsbench - Version: 20260122170733 - Stars: 501 - Forks: 0 - Last Updated: 2026-02-06 - Source: https://github.com/benchflow-ai/skillsbench - Web: https://mule.run/skillshub/@@benchflow-ai/skillsbench~usgs-data-download:20260122170733 --- --- name: usgs-data-download description: Download water level data from USGS using the dataretrieval package. Use when accessing real-time or historical streamflow data, downloading gage height or discharge measurements, or working with USGS station IDs. license: MIT --- # USGS Data Download Guide ## Overview This guide covers downloading water level data from USGS using the `dataretrieval` Python package. USGS maintains thousands of stream gages across the United States that record water levels at 15-minute intervals. ## Installation ```bash pip install dataretrieval ``` ## nwis Module (Recommended) The NWIS module is reliable and straightforward for accessing gage height data. ```python from dataretrieval import nwis # Get instantaneous values (15-min intervals) df, meta = nwis.get_iv( sites='', start='', end='', parameterCd='00065' ) # Get daily values df, meta = nwis.get_dv( sites='', start='', end='', parameterCd='00060' ) # Get site information info, meta = nwis.get_info(sites='') ``` ## Parameter Codes | Code | Parameter | Unit | Description | |------|-----------|------|-------------| | `00065` | Gage height | feet | Water level above datum | | `00060` | Discharge | cfs | Streamflow volume | ## nwis Module Functions | Function | Description | Data Frequency | |----------|-------------|----------------| | `nwis.get_iv()` | Instantaneous values | ~15 minutes | | `nwis.get_dv()` | Daily values | Daily | | `nwis.get_info()` | Site information | N/A | | `nwis.get_stats()` | Statistical summaries | N/A | | `nwis.get_peaks()` | Annual peak discharge | Annual | ## Returned DataFrame Structure The DataFrame has a datetime index and these columns: | Column | Description | |--------|-------------| | `site_no` | Station ID | | `00065` | Water level value | | `00065_cd` | Quality code (can ignore) | ## Downloading Multiple Stations ```python from dataretrieval import nwis station_ids = ['', '', ''] all_data = {} for site_id in station_ids: try: df, meta = nwis.get_iv( sites=site_id, start='', end='', parameterCd='00065' ) if len(df) > 0: all_data[site_id] = df except Exception as e: print(f"Failed to download {site_id}: {e}") print(f"Successfully downloaded: {len(all_data)} stations") ``` ## Extracting the Value Column ```python # Find the gage height column (excludes quality code column) gage_col = [c for c in df.columns if '00065' in str(c) and '_cd' not in str(c)] if gage_col: water_levels = df[gage_col[0]] print(water_levels.head()) ``` ## Common Issues | Issue | Cause | Solution | |-------|-------|----------| | Empty DataFrame | Station has no data for date range | Try different dates or use `get_iv()` | | `get_dv()` returns empty | No daily gage height data | Use `get_iv()` and aggregate | | Connection error | Network issue | Wrap in try/except, retry | | Rate limited | Too many requests | Add delays between requests | ## Best Practices - Always wrap API calls in try/except for failed downloads - Check `len(df) > 0` before processing - Station IDs are 8-digit strings with leading zeros (e.g., '04119000') - Use `get_iv()` for gage height, as daily data is often unavailable - Filter columns to exclude quality code columns (`_cd`) - Break up large requests into smaller time periods to avoid timeouts