ulmo Readers¶

ulmo readers / APIs.

note on dates and times¶

Dates and times can be provided in a few different ways, depending on what is convenient. They can be given either as a string representation or as instances of date and datetime objects from python’s datetime standard library module. For strings, the ISO 8601 format (‘YYYY-mm-dd HH:MM:SS’ or some abbreviated version) is accepted, as well as dates in ‘mm/dd/YYYY’ format.
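As a rough illustration of the accepted inputs, the sketch below normalizes each form to a datetime using only the standard library. The helper name `to_datetime` is hypothetical and this is not ulmo's own parsing code; it covers only the full ISO form, the date-only abbreviation, and the ‘mm/dd/YYYY’ form.

```python
import datetime

# Formats described in the note above; other ISO abbreviations are not covered here.
_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%m/%d/%Y")


def to_datetime(value):
    """Normalize a date, datetime, or supported string to a datetime (hypothetical helper)."""
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, datetime.date):
        return datetime.datetime(value.year, value.month, value.day)
    for fmt in _FORMATS:
        try:
            return datetime.datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("unsupported date string: %r" % (value,))


print(to_datetime("2014-01-31"))
print(to_datetime("01/31/2014"))
print(to_datetime(datetime.date(2014, 1, 31)))
```

All three calls describe the same instant, which is the point of the note: any of these forms can be passed wherever a function takes a date or datetime argument.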

Readers for Global to USA-national data¶

Climate Prediction Center (CPC) Weekly Drought¶

Climate Prediction Center Weekly Drought Index dataset

ulmo.cpc.drought.get_data(state=None, climate_division=None, start=None, end=None, as_dataframe=False)¶

Retrieves data.

Parameters:	state (`None` or str) – If specified, results will be limited to the state corresponding to the given 2-character state code. climate_division (`None` or int) – If specified, results will be limited to the climate division. start (`None` or date (see note on dates and times)) – Results will be limited to those after the given date. Default is the start of the current calendar year. end (`None` or date (see note on dates and times)) – If specified, results will be limited to data before this date. as_dataframe (bool) – If `False` (default), a dict with a nested set of dicts will be returned with data indexed by state, then climate division. If `True` then a pandas.DataFrame object will be returned. The pandas dataframe is used internally, so setting this to `True` is a little bit faster as it skips a serialization step.
Returns:	data – A dict or pandas.DataFrame representing the data. See the `as_dataframe` parameter for more.
Return type:	dict or pandas.DataFrame
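A minimal sketch of calling this function and walking the nested dict it returns when `as_dataframe` is `False`. The wrapper `fetch_drought` and the helper `divisions_by_state` are hypothetical names, and the `sample` dict is a hand-written stand-in whose leaf values are placeholders, since the structure below the climate division level is not described here.

```python
def fetch_drought(state, start, end):
    """Thin wrapper; needs ulmo installed and network access when called."""
    import ulmo  # imported lazily so the helper below runs without ulmo

    return ulmo.cpc.drought.get_data(state=state, start=start, end=end)


def divisions_by_state(data):
    """Map each state key to the sorted climate divisions found under it."""
    return {state: sorted(divisions) for state, divisions in data.items()}


# Hypothetical stand-in for a returned dict: state -> climate division -> values.
sample = {"TX": {2: [], 1: []}, "OK": {5: []}}
print(divisions_by_state(sample))
```

For larger queries, passing `as_dataframe=True` avoids the serialization step mentioned above.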

CUAHSI Hydrologic Information System (HIS)¶

CUAHSI HIS Central¶

CUAHSI HIS Central catalog web services

ulmo.cuahsi.his_central.get_services(bbox=None, user_cache=False)¶

Retrieves a list of services.

Parameters:	bbox (`None` or 4-tuple) – Optional argument for a bounding box that covers the area you want to look for services in. This should be a tuple containing (min_longitude, min_latitude, max_longitude, and max_latitude) with these values in decimal degrees. If not provided then the full set of services will be queried from HIS Central. user_cache (bool) – If False (default), use the system temp location to store cache WSDL and other files. Use the default user ulmo directory if True.
Returns:	services_dicts – A list of dicts that each contain information on an individual service.
Return type:	list
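The bounding box ordering is easy to get wrong, so the sketch below builds the tuple through a small checking helper before querying. `make_bbox` and `services_in` are hypothetical names introduced for illustration; only the `get_services` call and its `bbox` argument come from the signature above.

```python
def make_bbox(min_longitude, min_latitude, max_longitude, max_latitude):
    """Return a (min_lon, min_lat, max_lon, max_lat) tuple in decimal degrees."""
    if not -180 <= min_longitude <= max_longitude <= 180:
        raise ValueError("longitudes must satisfy -180 <= min <= max <= 180")
    if not -90 <= min_latitude <= max_latitude <= 90:
        raise ValueError("latitudes must satisfy -90 <= min <= max <= 90")
    return (min_longitude, min_latitude, max_longitude, max_latitude)


def services_in(bbox):
    """Query HIS Central; needs ulmo installed and network access when called."""
    import ulmo

    return ulmo.cuahsi.his_central.get_services(bbox=bbox)


# Arbitrary example box; the coordinates are illustrative only.
print(make_bbox(-100.0, 28.0, -95.0, 32.0))
```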

CUAHSI WaterOneFlow (WOF)¶

CUAHSI WaterOneFlow (WOF) web data access services. These services provide access to a wide variety of data sources that use the standardized WOF service protocol. Most such services are registered with the CUAHSI HIS Central catalog and can be identified via queries using the ulmo.cuahsi.his_central.get_services catalog web service. Each WOF service may have some unique characteristics, such as specific regional and temporal domains, set of variables, or additional constraints. The notes below provide additional usage details for some data sources.

NRCS SNOTEL: USDA Natural Resources Conservation Service (NRCS) Snow Telemetry network of remote, high-elevation mountain sites in the western U.S., used to monitor snowpack, precipitation, temperature and other climatic conditions. Timestamps in the request and data response are in PST (UTC-8).

ulmo.cuahsi.wof.get_sites(wsdl_url, suds_cache=('default', ), timeout=None, user_cache=False)¶

Retrieves information on the sites that are available from a WaterOneFlow service using a GetSites request. For more detailed information including which variables and time periods are available for a given site, use get_site_info().

Parameters:	wsdl_url (str) – URL of a service’s web service definition language (WSDL) description. All WaterOneFlow services publish a WSDL description and this url is the entry point to the service. suds_cache (None or tuple) – SOAP local cache duration for WSDL description and client object. Pass a cache duration tuple like (‘days’, 3) to set a custom duration. Duration may be in months, weeks, days, hours, or seconds. If unspecified, the default duration (1 day) will be used. Use `None` to turn off caching. timeout (int or float) – suds SOAP URL open timeout (seconds). If unspecified, the suds default (90 seconds) will be used. user_cache (bool) – If False (default), use the system temp location to store cache WSDL and other files. Use the default user ulmo directory if True.
Returns:	sites_dict – a python dict with site codes mapped to site information
Return type:	dict
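The `suds_cache` argument takes a (unit, amount) tuple, which the following sketch builds and checks before passing it on. `cache_duration` and `list_sites` are hypothetical helper names; the allowed units are the ones listed in the parameter description above.

```python
UNITS = ("months", "weeks", "days", "hours", "seconds")


def cache_duration(unit, amount):
    """Build a cache duration tuple such as ('days', 3)."""
    if unit not in UNITS:
        raise ValueError("unit must be one of %s" % (UNITS,))
    if amount <= 0:
        raise ValueError("amount must be positive")
    return (unit, amount)


def list_sites(wsdl_url, cache=("default",), timeout=None):
    """Call get_sites; needs ulmo installed and network access when called."""
    import ulmo

    return ulmo.cuahsi.wof.get_sites(wsdl_url, suds_cache=cache, timeout=timeout)


print(cache_duration("days", 3))
```

Passing `cache=None` to the wrapper turns caching off, as described for `suds_cache`.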

ulmo.cuahsi.wof.get_site_info(wsdl_url, site_code, suds_cache=('default', ), timeout=None, user_cache=False)¶

Retrieves detailed site information from a WaterOneFlow service using a GetSiteInfo request.

Parameters:	wsdl_url (str) – URL of a service’s web service definition language (WSDL) description. All WaterOneFlow services publish a WSDL description and this url is the entry point to the service. site_code (str) – Site code of the site you’d like to get more information for. Site codes MUST contain the network and be of the form <network>:<site_code>, as is required by WaterOneFlow. suds_cache (`None` or tuple) – SOAP local cache duration for WSDL description and client object. Pass a cache duration tuple like (‘days’, 3) to set a custom duration. Duration may be in months, weeks, days, hours, or seconds. If unspecified, the default duration (1 day) will be used. Use `None` to turn off caching. timeout (int or float) – suds SOAP URL open timeout (seconds). If unspecified, the suds default (90 seconds) will be used. user_cache (bool) – If False (default), use the system temp location to store cache WSDL and other files. Use the default user ulmo directory if True.
Returns:	site_info – a python dict containing site information
Return type:	dict

ulmo.cuahsi.wof.get_values(wsdl_url, site_code, variable_code, start=None, end=None, suds_cache=('default', ), timeout=None, user_cache=False)¶

Retrieves site values from a WaterOneFlow service using a GetValues request.

Parameters:	wsdl_url (str) – URL of a service’s web service definition language (WSDL) description. All WaterOneFlow services publish a WSDL description and this url is the entry point to the service. site_code (str) – Site code of the site you’d like to get values for. Site codes MUST contain the network and be of the form <network>:<site_code>, as is required by WaterOneFlow. variable_code (str) – Variable code of the variable you’d like to get values for. Variable codes MUST contain the network and be of the form <vocabulary>:<variable_code>, as is required by WaterOneFlow. start (`None` or datetime (see note on dates and times)) – Start of the query datetime range. If omitted, data from the start of the time series to the `end` timestamp will be returned (but see caveat, in note below). end (`None` or datetime (see note on dates and times)) – End of the query datetime range. If omitted, data from the `start` timestamp to end of the time series will be returned (but see caveat, in note below). suds_cache (`None` or tuple) – SOAP local cache duration for WSDL description and client object. Pass a cache duration tuple like (‘days’, 3) to set a custom duration. Duration may be in months, weeks, days, hours, or seconds. If unspecified, the default duration (1 day) will be used. Use `None` to turn off caching. timeout (int or float) – suds SOAP URL open timeout (seconds). If unspecified, the suds default (90 seconds) will be used. user_cache (bool) – If False (default), use the system temp location to store cache WSDL and other files. Use the default user ulmo directory if True.
Returns:	site_values – a python dict containing values
Return type:	dict

Notes

If both start and end parameters are omitted, the entire time series available will typically be returned. However, some service providers will return an error if either start or end is omitted; this is especially true for services hosted or redirected by CUAHSI via the CUAHSI HydroPortal, which have a ‘WSDL’ url using the domain https://hydroportal.cuahsi.org. For HydroPortal, a start datetime of ‘1753-01-01’ has been known to return valid results while catching the oldest start times, though the response may be broken up into chunks (‘paged’).
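One way to follow this note is to always send an explicit window and to build the prefixed codes in one place. The sketch below does both; `qualified`, `explicit_window`, and `fetch_values` are hypothetical helpers, and the fallback end date (today) is an assumption rather than documented behaviour.

```python
import datetime

# Start date reported above as working for HydroPortal-hosted services.
HYDROPORTAL_EARLIEST = "1753-01-01"


def qualified(prefix, code):
    """Join a network or vocabulary prefix and a code as '<prefix>:<code>'."""
    return "%s:%s" % (prefix, code)


def explicit_window(start=None, end=None, today=None):
    """Fill in missing bounds so both start and end are always sent."""
    today = today or datetime.date.today()
    return (start or HYDROPORTAL_EARLIEST, end or today.isoformat())


def fetch_values(wsdl_url, network, site, vocabulary, variable, start=None, end=None):
    """Call get_values; needs ulmo installed and network access when called."""
    import ulmo

    start, end = explicit_window(start, end)
    return ulmo.cuahsi.wof.get_values(
        wsdl_url,
        qualified(network, site),
        qualified(vocabulary, variable),
        start=start,
        end=end,
    )


# 'EXAMPLE' and '0001' are placeholder codes, not real identifiers.
print(qualified("EXAMPLE", "0001"))
print(explicit_window(today=datetime.date(2020, 1, 2)))
```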

ulmo.cuahsi.wof.get_variable_info(wsdl_url, variable_code=None, suds_cache=('default', ), timeout=None, user_cache=False)¶

Retrieves variable information from a WaterOneFlow service using a GetVariableInfo request.

Parameters:	wsdl_url (str) – URL of a service’s web service definition language (WSDL) description. All WaterOneFlow services publish a WSDL description and this url is the entry point to the service. variable_code (None or str) – If None (default) then information on all variables will be returned, otherwise, this should be set to the variable code of the variable you’d like to get more information on. Variable codes MUST contain the network and be of the form <vocabulary>:<variable_code>, as is required by WaterOneFlow. suds_cache (`None` or tuple) – SOAP local cache duration for WSDL description and client object. Pass a cache duration tuple like (‘days’, 3) to set a custom duration. Duration may be in months, weeks, days, hours, or seconds. If unspecified, the default duration (1 day) will be used. Use `None` to turn off caching. timeout (int or float) – suds SOAP URL open timeout (seconds). If unspecified, the suds default (90 seconds) will be used. user_cache (bool) – If False (default), use the system temp location to store cache WSDL and other files. Use the default user ulmo directory if True.
Returns:	variable_info – a python dict containing variable information. If variable_code is None (default) then this will be a nested set of dicts keyed by <vocabulary>:<variable_code>
Return type:	dict

NASA ORNL Daymet weather data services¶

NASA EARTHDATA ORNL DAAC Daymet web services

ulmo.nasa.daymet.get_variables()¶

Retrieves the available variables.

Parameters:	None
Returns:	dictionary of variables with variable abbreviations as keys and description as values

ulmo.nasa.daymet.get_daymet_singlepixel(latitude, longitude, variables=['tmax', 'tmin', 'prcp'], years=None, as_dataframe=True)¶

Fetches a time series of climate variables from the DAYMET single pixel extraction

Parameters:	latitude (float) – The latitude (WGS84), value between 14.5 and 52.0. longitude (float) – The longitude (WGS84), value between -131.0 and -53.0. variables (list of str) – Daymet parameters to fetch. default = [‘tmax’, ‘tmin’, ‘prcp’]. Available options: ‘tmax’: maximum temperature; ‘tmin’: minimum temperature; ‘srad’: shortwave radiation; ‘vp’: vapor pressure; ‘swe’: snow-water equivalent; ‘prcp’: precipitation; ‘dayl’: daylength. years (list of int) – List of years to return. Daymet version 2 available 1980 to the latest full calendar year. If `None` (default), all years will be returned. as_dataframe (`True` (default) or `False`) – If `True`, return a pandas dataframe; if `False`, return an open file with contents in csv format.
Returns:	single_pixel_timeseries
Return type:	pandas dataframe or csv filename
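Because the single pixel service only covers the latitude and longitude ranges given above, it can help to check a point before requesting it. This is a minimal sketch; `in_daymet_extent` and `fetch_pixel` are hypothetical helpers, and the example coordinates are arbitrary.

```python
def in_daymet_extent(latitude, longitude):
    """True if the point lies inside the documented latitude/longitude ranges."""
    return 14.5 <= latitude <= 52.0 and -131.0 <= longitude <= -53.0


def fetch_pixel(latitude, longitude, variables=("tmax", "tmin", "prcp"), years=None):
    """Call get_daymet_singlepixel; needs ulmo and network access when called."""
    if not in_daymet_extent(latitude, longitude):
        raise ValueError("point is outside the documented Daymet extent")
    import ulmo

    # A fresh list is passed so the function's default list is never shared.
    return ulmo.nasa.daymet.get_daymet_singlepixel(
        latitude, longitude, variables=list(variables), years=years
    )


print(in_daymet_extent(30.0, -97.0))
print(in_daymet_extent(60.0, -97.0))
```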

National Climatic Data Center (NCDC)¶

NCDC Climate Index Reference Sequential (CIRS)¶

National Climatic Data Center Climate Index Reference Sequential (CIRS) drought dataset

ulmo.ncdc.cirs.get_data(elements=None, by_state=False, location_names='abbr', as_dataframe=False, use_file=None)¶

Retrieves data.

Parameters:	elements (`None`, str or list) – The element(s) to get data for. If `None` (default), then all elements are used. An individual element is a string, but a list or tuple of them can be used to specify a set of elements. Elements are: ‘cddc’: Cooling Degree Days ‘hddc’: Heating Degree Days ‘pcpn’: Precipitation ‘pdsi’: Palmer Drought Severity Index ‘phdi’: Palmer Hydrological Drought Index ‘pmdi’: Modified Palmer Drought Severity Index ‘sp01’: 1-month Standardized Precipitation Index ‘sp02’: 2-month Standardized Precipitation Index ‘sp03’: 3-month Standardized Precipitation Index ‘sp06’: 6-month Standardized Precipitation Index ‘sp09’: 9-month Standardized Precipitation Index ‘sp12’: 12-month Standardized Precipitation Index ‘sp24’: 24-month Standardized Precipitation Index ‘tmpc’: Temperature ‘zndx’: ZNDX by_state (bool) – If False (default), divisional data will be retrieved. If True, then regional data will be retrieved. location_names (str or `None`) – This parameter defines what (if any) type of names will be added to the values. If set to ‘abbr’ (default), then abbreviated location names will be used. If ‘full’, then full location names will be used. If set to None, then no location name will be added and the only identifier will be the location_codes (this is the most memory-conservative option). as_dataframe (bool) – If `False` (default), a list of values dicts is returned. If `True`, a dict with element codes mapped to equivalent pandas.DataFrame objects will be returned. The pandas dataframe is used internally, so setting this to `True` is faster as it skips a somewhat expensive serialization step. use_file (`None`, file-like object or str) – If `None` (default), then data will be automatically retrieved from the web. If a file-like object or a file path string, then the file will be used to read data from. This is intended to be used for reading in previously-downloaded versions of the dataset.
Returns:	data – A list of value dicts or a pandas.DataFrame containing data. See the `as_dataframe` parameter for more.
Return type:	list or pandas.DataFrame
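Since `elements` accepts `None`, a single string, or a list, a small normalizing step can catch misspelled element codes before any download starts. The sketch below is one hedged way to do that; `normalize_elements` is a hypothetical helper and is not part of ulmo, while the codes are the ones listed above.

```python
ELEMENTS = (
    "cddc", "hddc", "pcpn", "pdsi", "phdi", "pmdi", "sp01", "sp02",
    "sp03", "sp06", "sp09", "sp12", "sp24", "tmpc", "zndx",
)


def normalize_elements(elements=None):
    """Return a list of element codes, expanding None to every known code."""
    if elements is None:
        return list(ELEMENTS)
    if isinstance(elements, str):
        elements = [elements]
    unknown = [code for code in elements if code not in ELEMENTS]
    if unknown:
        raise ValueError("unknown element code(s): %s" % ", ".join(unknown))
    return list(elements)


print(normalize_elements("pdsi"))
print(normalize_elements(("pcpn", "tmpc")))
```

The resulting list can be passed directly as the `elements` argument.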

NCDC Global Historical Climate Network (GHCN) Daily¶

National Climatic Data Center Global Historical Climate Network - Daily dataset

ulmo.ncdc.ghcn_daily.get_data(station_id, elements=None, update=True, as_dataframe=False)¶

Retrieves data for a given station.

Parameters:	station_id (str) – Station ID to retrieve data for. elements (`None`, str, or list of str) – If specified, limits the query to given element code(s). update (bool) – If `True` (default), new data files will be downloaded if they are newer than any previously cached files. If `False`, then previously downloaded files will be used and new files will only be downloaded if there is not a previously downloaded file for a given station. as_dataframe (bool) – If `False` (default), a dict with element codes mapped to value dicts is returned. If `True`, a dict with element codes mapped to equivalent pandas.DataFrame objects will be returned. The pandas dataframe is used internally, so setting this to `True` is a little bit faster as it skips a serialization step.
Returns:	site_dict – A dict with element codes as keys, mapped to collections of values. See the `as_dataframe` parameter for more.
Return type:	dict

ulmo.ncdc.ghcn_daily.get_stations(country=None, state=None, elements=None, start_year=None, end_year=None, update=True, as_dataframe=False)¶

Retrieves station information, optionally limited to specific parameters.

Parameters:	country (str) – The country code to use to limit station results. If set to `None` (default), then stations from all countries are returned. state (str) – The state code to use to limit station results. If set to `None` (default), then stations from all states are returned. elements (`None`, str, or list of str) – If specified, station results will be limited to the given element codes and only stations that have data for any of these elements will be returned. start_year (int) – If specified, station results will be limited to contain only stations that have data after this year. Can be combined with the `end_year` argument to get stations with data within a range of years. end_year (int) – If specified, station results will be limited to contain only stations that have data before this year. Can be combined with the `start_year` argument to get stations with data within a range of years. update (bool) – If `True` (default), new data files will be downloaded if they are newer than any previously cached files. If `False`, then previously downloaded files will be used and new files will only be downloaded if there is not a previously downloaded file for a given station. as_dataframe (bool) – If `False` (default), a dict with station IDs keyed to station dicts is returned. If `True`, a single pandas.DataFrame object will be returned. The pandas dataframe is used internally, so setting this to `True` is a little bit faster as it skips a serialization step.
Returns:	stations_dict – A dict or pandas.DataFrame representing station information for stations matching the arguments. See the `as_dataframe` parameter for more.
Return type:	dict or pandas.DataFrame

NCDC Global Summary of the Day (GSoD)¶

National Climatic Data Center Global Summary of the Day dataset

ulmo.ncdc.gsod.get_data(station_codes, start=None, end=None, parameters=None)¶

Retrieves data for a set of stations.

Parameters:	station_codes (str or list) – Single station code or iterable of station codes to retrieve data for. start (`None` or date (see note on dates and times)) – If specified, data are limited to values after this date. end (`None` or date (see note on dates and times)) – If specified, data are limited to values before this date. parameters (`None`, str or list) – If specified, data are limited to this set of parameter codes.
Returns:	data_dict – Dict with station codes keyed to lists of value dicts.
Return type:	dict

ulmo.ncdc.gsod.get_stations(country=None, state=None, start=None, end=None, update=True)¶

Retrieve information on the set of available stations.

Parameters:	country ({`None`, str, or iterable}) – If specified, results will be limited to stations with matching country codes. state ({`None`, str, or iterable}) – If specified, results will be limited to stations with matching state codes. start (`None` or date (see note on dates and times)) – If specified, results will be limited to stations which have data after this start date. end (`None` or date (see note on dates and times)) – If specified, results will be limited to stations which have data before this end date. update (bool) – If `True` (default), check for a newer copy of the stations file and download if it is newer than the previously downloaded copy. If `False`, then a new stations file will only be downloaded if a previously downloaded file cannot be found.
Returns:	stations_dict – A dict with USAF-WBAN codes keyed to station information dicts.
Return type:	dict

NOAA GOES Data Collection System (DCS) services¶

NOAA GOES Data Collection System: access to the data stream transmitted via GOES satellite.

ulmo.noaa.goes.get_data(dcp_address, hours, use_cache=False, cache_path=None, as_dataframe=True)¶

Fetches GOES Satellite DCP messages from NOAA Data Collection System (DCS) field test.

Parameters:	dcp_address (str, iterable of strings) – DCP address or list of DCP addresses to be fetched; lists will be joined by a ‘,’. use_cache (bool) – If True, use an hdf file to cache data and retrieve new data on subsequent requests. The signature above shows a default of False. cache_path ({`None`, str}) – If `None`, use the default ulmo location for cached files; otherwise use the specified path. Files are named using dcp_address. as_dataframe (bool) – If True (default), return data in a pandas dataframe; otherwise return a dict.
Returns:	message_data – Either a pandas dataframe or a dict indexed by dcp message times
Return type:	{pandas.DataFrame, dict}

ulmo.noaa.goes.decode(dataframe, parser, **kwargs)¶

Decodes GOES message data in the pandas dataframe returned by ulmo.noaa.goes.get_data().

Parameters:	dataframe (pandas.DataFrame) – pandas.DataFrame returned by ulmo.noaa.goes.get_data(). parser ({function, str}) – function that acts on the dcp_message of each row of the dataframe and returns a new dataframe containing several rows of decoded data. This returned dataframe may have different (but derived) timestamps than the original row. If a string is passed then a matching parser function is looked up from ulmo.noaa.goes.parsers
Returns:	decoded_data – pandas dataframe, the format and parameters in the returned dataframe depend wholly on the parser used
Return type:	pandas.DataFrame

USGS National Water Information System (NWIS)¶

USGS National Water Information System web services

ulmo.usgs.nwis.get_sites(service=None, input_file=None, sites=None, state_code=None, huc=None, bounding_box=None, county_code=None, parameter_code=None, site_type=None, **kwargs)¶

Fetches site information from USGS services. See the USGS Site Service documentation for a detailed description of options. For convenience, major options have been included with pythonic names. At least one major filter must be specified. Options that are not listed below may be provided as extra kwargs (i.e. keyword=’argument’) and will be passed along with the web services request. These extra keywords must match the USGS names exactly. The USGS Site Service website describes available keyword names and argument formats.

Note

Only the options listed below have been tested and you may have mixed results retrieving data with extra options specified. Currently ulmo requests and parses data in the WaterML 1.x format. Some options are not available in this format.

Parameters:	service ({`None`, ‘instantaneous’, ‘iv’, ‘daily’, ‘dv’}) – The service to use, either “instantaneous”, “daily”, or None (default). If set to `None`, then both services are used. The abbreviations “iv” and “dv” can be used for “instantaneous” and “daily”, respectively. input_file (`None`, file path or file object) – If `None` (default), then the NWIS web services will be queried, but if a file is passed then this file will be used instead of requesting data from the NWIS web services. sites (str, iterable of strings or `None`) – A major filter. The site(s) to use; lists will be joined by a ‘,’. At least one major filter must be specified. state_code (str or `None`) – A major filter. Two-letter state code used in `stateCd` parameter. At least one major filter must be specified. county_code (str, iterable of strings or `None`) – A major filter. The 5 digit FIPS county code(s) used in the countyCd parameter; lists will be joined by a ‘,’. At least one major filter must be specified. huc (str, iterable of strings or `None`) – A major filter. The hydrologic unit code(s) to use; lists will be joined by a ‘,’. At least one major filter must be specified. bounding_box (str, iterable of strings or `None`) – A major filter. This bounding box used in the bBox parameter. The format is westernmost longitude, southernmost latitude, easternmost longitude, northernmost latitude; lists will be joined by a ‘,’. At least one major filter must be specified. parameter_code (str, iterable of strings or `None`) – Optional filter. Parameter code(s) that will be passed as the `parameterCd` parameter; lists will be joined by a ‘,’. This parameter represents the following USGS website input: Sites serving parameter codes site_type (str, iterable of strings or `None`) – Optional filter. The type(s) of site used in `siteType` parameter; lists will be joined by a ‘,’.
Returns:	return_sites – a python dict with site codes mapped to site information
Return type:	dict
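The "at least one major filter" rule and the comma-joining of lists can be sketched as a small pre-flight check. `join_values`, `site_query`, and `fetch_sites` are hypothetical helpers; ulmo performs its own joining, so this only mirrors the documented behaviour, and the HUC values are placeholders.

```python
MAJOR_FILTERS = ("sites", "state_code", "huc", "bounding_box", "county_code")


def join_values(value):
    """Join an iterable into the comma-separated form sent to the service."""
    if value is None or isinstance(value, str):
        return value
    return ",".join(str(item) for item in value)


def site_query(**filters):
    """Check the major-filter rule and return the joined, non-empty filters."""
    if not any(filters.get(name) is not None for name in MAJOR_FILTERS):
        raise ValueError("at least one major filter must be specified")
    return {key: join_values(val) for key, val in filters.items() if val is not None}


def fetch_sites(**filters):
    """Call get_sites; needs ulmo installed and network access when called."""
    import ulmo

    return ulmo.usgs.nwis.get_sites(**site_query(**filters))


# Placeholder HUC codes; '00060' is the streamflow parameter code used later in this page.
print(site_query(huc=["00000001", "00000002"], parameter_code="00060"))
```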

ulmo.usgs.nwis.get_site_data(site_code, service=None, parameter_code=None, statistic_code=None, start=None, end=None, period=None, modified_since=None, input_file=None, methods=None, **kwargs)¶

Fetches site data.

Parameters:	site_code (str) – The site code of the site you want to query data for. service ({`None`, ‘instantaneous’, ‘iv’, ‘daily’, ‘dv’}) – The service to use, either “instantaneous”, “daily”, or `None` (default). If set to `None`, then both services are used. The abbreviations “iv” and “dv” can be used for “instantaneous” and “daily”, respectively. parameter_code (str) – Parameter code(s) that will be passed as the parameterCd parameter. statistic_code (str) – Statistic code(s) that will be passed as the statCd parameter. start (`None` or datetime (see note on dates and times)) – Start of a date range for a query. This parameter is mutually exclusive with period (you cannot use both). It should not be older than 1910-1-1 for ‘iv’ and 1851-1-1 for ‘dv’ services. end (`None` or datetime (see note on dates and times)) – End of a date range for a query. This parameter is mutually exclusive with period (you cannot use both). period ({`None`, str, datetime.timedelta}) – Period of time to use for requesting data. This will be passed along as the period parameter. This can either be ‘all’ to signal that you’d like the entire period of record (down to 1910-1-1 for ‘iv’, 1851-1-1 for ‘dv’), or string in ISO 8601 period format (e.g. ‘P1Y2M21D’ for a period of one year, two months and 21 days) or it can be a datetime.timedelta object representing the period of time. This parameter is mutually exclusive with start/end dates. modified_since (`None` or datetime.timedelta) – Passed along as the modifiedSince parameter. input_file (`None`, file path or file object) – If `None` (default), then the NWIS web services will be queried, but if a file is passed then this file will be used instead of requesting data from the NWIS web services. methods (`None`, str or Python dict) – If `None` (default), it’s assumed that there is a single method for each parameter. This raises an error if more than one method id is encountered. If str, this is the method id for the requested parameter/s and can use “all” if method ids are not known beforehand. If dict, provide the parameter_code to method id mapping. Parameter’s method id is specific to site.
Returns:	data_dict – a python dict with parameter codes mapped to value dicts
Return type:	dict
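The mutual exclusivity of `start`/`end` and `period`, and the ISO 8601 period string, can be illustrated with a short sketch. `iso_period` and `time_arguments` are hypothetical helpers; ulmo accepts a `datetime.timedelta` directly, so the conversion here only shows what such a period looks like as a string.

```python
import datetime


def iso_period(delta):
    """Format a non-negative timedelta as an ISO 8601 period (microseconds ignored)."""
    if delta < datetime.timedelta(0):
        raise ValueError("period must not be negative")
    period = "P%dD" % delta.days
    if delta.seconds:
        period += "T%dS" % delta.seconds
    return period


def time_arguments(start=None, end=None, period=None):
    """Return only the time keywords that are set, rejecting invalid mixes."""
    if period is not None and (start is not None or end is not None):
        raise ValueError("period cannot be combined with start or end")
    if isinstance(period, datetime.timedelta):
        period = iso_period(period)
    pairs = (("start", start), ("end", end), ("period", period))
    return {key: value for key, value in pairs if value is not None}


print(iso_period(datetime.timedelta(days=7)))
print(time_arguments(start="2014-01-01", end="2014-02-01"))
```

The returned dict can be unpacked into a `get_site_data` call alongside `site_code` and `service`.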

ulmo.usgs.nwis.hdf5.get_site(site_code, path=None, complevel=None, complib=None)¶

Fetches previously-cached site information from an hdf5 file.

Parameters:	site_code (str) – The site code of the site you want to get information for. path (`None` or file path) – Path to the hdf5 file to be queried, if `None` then the default path will be used. If a file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking. complevel (`None` or int {0-9}) – Open hdf5 file with this level of compression. If `None` (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is. complib (`None` or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}) – Open hdf5 file with this type of compression. If `None` (default) then the best available compression library available on your system will be selected. If complevel argument is set to 0 then no compression will be used.
Returns:	site_dict – a python dict containing site information
Return type:	dict

ulmo.usgs.nwis.hdf5.get_site_data(site_code, agency_code=None, parameter_code=None, path=None, complevel=None, complib=None, start=None)¶

Fetches previously-cached site data from an hdf5 file.

Parameters:	site_code (str) – The site code of the site you want to get data for. agency_code (`None` or str) – The agency code to get data for. This will need to be set if a site code is in use by multiple agencies (this is rare). parameter_code (`None`, str, or list) – List of parameters to read. If `None` (default) read all parameters. Otherwise only read specified parameters. Parameters should be specified with statistic code, i.e. daily streamflow is ‘00060:00003’. path (`None` or file path) – Path to the hdf5 file to be queried, if `None` then the default path will be used. If a file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking. complevel (`None` or int {0-9}) – Open hdf5 file with this level of compression. If `None` (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is. complib (`None` or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}) – Open hdf5 file with this type of compression. If `None` (default) then the best available compression library available on your system will be selected. If complevel argument is set to 0 then no compression will be used. start (`None` or string formatted date like 2014-01-01) – Filter the dataset to return only data later than the start date.
Returns:	data_dict – a python dict with parameter codes mapped to value dicts
Return type:	dict

ulmo.usgs.nwis.hdf5.get_sites(path=None, complevel=None, complib=None)¶

Fetches previously-cached site information from an hdf5 file.

Parameters:	path (`None` or file path) – Path to the hdf5 file to be queried, if `None` then the default path will be used. If a file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking. complevel (`None` or int {0-9}) – Open hdf5 file with this level of compression. If `None` (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is. complib (`None` or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}) – Open hdf5 file with this type of compression. If `None` (default) then the best available compression library available on your system will be selected. If complevel argument is set to 0 then no compression will be used.
Returns:	sites_dict – a python dict with site codes mapped to site information
Return type:	dict

ulmo.usgs.nwis.hdf5.remove_values(site_code, datetime_dicts, path=None, complevel=None, complib=None, autorepack=True)¶

Remove values from hdf5 file.

Parameters:	site_code (str) – The site code of the site to remove records from. datetime_dicts (dict) – A python dict with a list of datetimes for a given variable (key) to set as NaNs. path (file path) – Path to the hdf5 file.
Returns:	None
Return type:	`None`

ulmo.usgs.nwis.hdf5.repack(path, complevel=None, complib=None)¶

Repack the hdf5 file at path. This is the same as running the pytables ptrepack command on the file.

Parameters:	path (file path) – Path to the hdf5 file. complevel (`None` or int {0-9}) – Open hdf5 file with this level of compression. If `None` (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is. complib (`None` or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}) – Open hdf5 file with this type of compression. If `None` (default) then the best available compression library available on your system will be selected. If complevel argument is set to 0 then no compression will be used.
Returns:	None
Return type:	`None`

ulmo.usgs.nwis.hdf5.update_site_data(site_code, start=None, end=None, period=None, path=None, methods=None, input_file=None, complevel=None, complib=None, autorepack=True)¶

Update cached site data.

Parameters:	site_code (str) – The site code of the site you want to query data for. start (`None` or datetime (see note on dates and times)) – Start of a date range for a query. This parameter is mutually exclusive with period (you cannot use both). end (`None` or datetime (see note on dates and times)) – End of a date range for a query. This parameter is mutually exclusive with period (you cannot use both). period ({`None`, str, datetime.timedelta}) – Period of time to use for requesting data. This will be passed along as the period parameter. This can either be ‘all’ to signal that you’d like the entire period of record, or string in ISO 8601 period format (e.g. ‘P1Y2M21D’ for a period of one year, two months and 21 days) or it can be a datetime.timedelta object representing the period of time. This parameter is mutually exclusive with start/end dates. path (`None` or file path) – Path to the hdf5 file to be queried, if `None` then the default path will be used. If a file path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking. methods (`None`, str or Python dict) – If `None` (default), it’s assumed that there is a single method for each parameter. This raises an error if more than one method id is encountered. If str, this is the method id for the requested parameter/s and can use “all” if method ids are not known beforehand. If dict, provide the parameter_code to method id mapping. Parameter’s method id is specific to site. input_file (`None`, file path or file object) – If `None` (default), then the NWIS web services will be queried, but if a file is passed then this file will be used instead of requesting data from the NWIS web services. autorepack (bool) – Whether or not to automatically repack the h5 file(s) after updating. There is a tradeoff between performance and disk space here: large files take a longer time to repack but also tend to grow larger faster. The default of True conserves disk space because untamed file growth can become quite destructive. If you set this to False, you can manually repack files with repack().
Returns:	None
Return type:	`None`
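
The rule that period cannot be combined with start/end can be checked before any request is made. The following is a minimal sketch under that assumption: `build_range_kwargs` and `refresh_site` are hypothetical helpers, not part of ulmo, and the only ulmo call used is `update_site_data` with the parameters documented above.

```python
def build_range_kwargs(start=None, end=None, period=None):
    """Return the date-range keyword arguments for update_site_data.

    Hypothetical helper (not part of ulmo). It only enforces the documented
    rule that period is mutually exclusive with start/end.
    """
    if period is not None and (start is not None or end is not None):
        raise ValueError("period is mutually exclusive with start/end")
    if period is not None:
        return {"period": period}
    kwargs = {}
    if start is not None:
        kwargs["start"] = start
    if end is not None:
        kwargs["end"] = end
    return kwargs


def refresh_site(site_code, start=None, end=None, period=None, path=None):
    """Update the cache for one site. Needs ulmo installed and network access."""
    # Imported lazily so that build_range_kwargs can be used without ulmo.
    from ulmo.usgs.nwis import hdf5

    hdf5.update_site_data(
        site_code, path=path, **build_range_kwargs(start, end, period)
    )
```

Calling `refresh_site(site_code, period="P7D")` would request the last seven days for a site, while passing both `start` and `period` fails fast with a `ValueError` instead of reaching the web service.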

ulmo.usgs.nwis.hdf5.update_site_list(sites=None, state_code=None, huc=None, bounding_box=None, county_code=None, parameter_code=None, site_type=None, service=None, input_file=None, complevel=None, complib=None, autorepack=True, path=None, **kwargs)¶

Update cached site information.

See ulmo.usgs.nwis.core.get_sites() for description of regular parameters, only extra parameters used for caching are listed below.

Parameters:

path (`None` or file path) – Path to the hdf5 file to be queried. If `None`, the default path will be used. If the path is a directory, then multiple hdf5 files will be kept so that file sizes remain small for faster repacking.
input_file (`None`, file path or file object) – If `None` (default), then the NWIS web services will be queried, but if a file is passed then this file will be used instead of requesting data from the NWIS web services.
complevel (`None` or int {0-9}) – Open hdf5 file with this level of compression. If `None` (default), then a maximum compression level will be used if a compression library can be found. If set to 0 then no compression will be used regardless of what complib is.
complib (`None` or str {‘zlib’, ‘bzip2’, ‘lzo’, ‘blosc’}) – Open hdf5 file with this type of compression. If `None` (default), then the best compression library available on your system will be selected. If the complevel argument is set to 0 then no compression will be used.
autorepack (bool) – Whether or not to automatically repack the h5 file after updating. There is a tradeoff between performance and disk space here: large files take longer to repack but also tend to grow faster. The default of True conserves disk space, because unchecked file growth can become quite destructive. If you set this to False, you can manually repack files with repack().

Returns:	None
Return type:	`None`
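
The documented ranges for complevel and complib can be validated up front. This is a rough sketch, not ulmo's own validation: `compression_kwargs` and `refresh_site_list` are hypothetical names, and the accepted values are taken only from the parameter descriptions above.

```python
# Compression libraries named in the complib description above.
VALID_COMPLIBS = ("zlib", "bzip2", "lzo", "blosc")


def compression_kwargs(complevel=None, complib=None):
    """Check complevel/complib against the documented values and return them.

    Hypothetical helper (not part of ulmo).
    """
    if complevel is not None:
        is_int = isinstance(complevel, int) and not isinstance(complevel, bool)
        if not is_int or not 0 <= complevel <= 9:
            raise ValueError("complevel must be None or an integer from 0 to 9")
    if complib is not None and complib not in VALID_COMPLIBS:
        raise ValueError("complib must be None or one of %s" % (VALID_COMPLIBS,))
    return {"complevel": complevel, "complib": complib}


def refresh_site_list(state_code, path=None, complevel=None, complib=None):
    """Update cached site information. Needs ulmo installed and network access."""
    # Imported lazily so that compression_kwargs can be used without ulmo.
    from ulmo.usgs.nwis import hdf5

    hdf5.update_site_list(
        state_code=state_code, path=path, **compression_kwargs(complevel, complib)
    )
```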

USGS National Elevation Dataset (NED) raster services¶

National Elevation Dataset (NED) services (Raster)

ulmo.usgs.ned.get_available_layers()¶: Returns a list of available data layers.

ulmo.usgs.ned.get_raster(layer, bbox, path=None, update_cache=False, check_modified=False, mosaic=False)¶

Downloads National Elevation Dataset raster tiles that cover the given bounding box for the specified data layer.

Parameters:

layer (str) – Dataset layer name (see get_available_layers for a list).
bbox (sequence of float|str) – Bounding box, in geographic coordinates, of the area to download tiles for, in the format (min longitude, min latitude, max longitude, max latitude).
path (`None` or path) – If `None`, the default path will be used.
update_cache (`True` or `False` (default)) – If `False` and the output file already exists, use it.
check_modified (`True` or `False` (default)) – If a tile exists in path, check whether a newer file exists online and download it if available.
mosaic (`True` or `False` (default)) – If `True`, mosaic and clip downloaded tiles to the extents of the bbox provided. Requires the rasterio package and GDAL.

Returns:	raster_tiles – metadata as a FeatureCollection. local url of downloaded data is in feature[‘properties’][‘file’]
Return type:	geojson FeatureCollection
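
Since the return value is a GeoJSON FeatureCollection, the downloaded file locations can be collected from its `features` list. A minimal sketch, assuming each feature carries the `properties['file']` entry described above; `local_tile_paths` and `download_tiles` are hypothetical helpers, not part of ulmo.

```python
def local_tile_paths(feature_collection):
    """Return the local file location recorded for each downloaded tile.

    Hypothetical helper (not part of ulmo). Relies on the documented
    feature['properties']['file'] entry and the standard GeoJSON
    'features' list.
    """
    return [
        feature["properties"]["file"]
        for feature in feature_collection["features"]
    ]


def download_tiles(layer, bbox, path=None):
    """Download tiles and return their local paths. Needs ulmo and network access."""
    # Imported lazily so that local_tile_paths can be used without ulmo.
    from ulmo.usgs import ned

    return local_tile_paths(ned.get_raster(layer, bbox, path=path))
```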

ulmo.usgs.ned.get_raster_availability(layer, bbox=None)¶

Retrieves metadata for raster tiles that cover the given bounding box for the specified data layer.

Parameters:	layer (str) – Dataset layer name (see get_available_layers for a list). bbox (sequence of float|str) – Bounding box, in geographic coordinates, of the area to download tiles for, in the format (min longitude, min latitude, max longitude, max latitude).
Returns:	metadata – returns metadata including download urls as a FeatureCollection
Return type:	geojson FeatureCollection
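
The bounding box order (min longitude, min latitude, max longitude, max latitude) is easy to get wrong. The sketch below normalizes two opposite corners into that order; `make_bbox` is a hypothetical helper, not part of ulmo, and the range checks are an added assumption rather than something ulmo is known to enforce.

```python
def make_bbox(lon_a, lat_a, lon_b, lat_b):
    """Build a (min_lon, min_lat, max_lon, max_lat) tuple from two corners.

    Hypothetical helper (not part of ulmo). Accepts floats or numeric
    strings, since bbox values are documented as float|str.
    """
    lons = sorted((float(lon_a), float(lon_b)))
    lats = sorted((float(lat_a), float(lat_b)))
    if lons[0] < -180 or lons[1] > 180:
        raise ValueError("longitude must be within [-180, 180]")
    if lats[0] < -90 or lats[1] > 90:
        raise ValueError("latitude must be within [-90, 90]")
    return (lons[0], lats[0], lons[1], lats[1])
```

The resulting tuple can be passed as the bbox argument of `get_raster_availability` or `get_raster`.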

Readers for USA regional (sub-national) data¶

California Department of Water Resources Historical Data¶

ulmo.cdec.historical.get_stations()¶

Fetches information on all CDEC sites.

Returns:	df – a pandas DataFrame (indexed on site id) with station information.
Return type:	pandas DataFrame

ulmo.cdec.historical.get_sensors(sensor_id=None)¶

Gets a list of sensor ids as a DataFrame indexed on sensor number. Can be limited by a list of numbers.

Usage example:

from ulmo import cdec
# to get all available sensor info
sensors = cdec.historical.get_sensors()
# or to get just one sensor
sensor = cdec.historical.get_sensors([1])

Parameters:	sensor_id (iterable of integers or `None`)
Returns:	df – a pandas DataFrame of sensor ids, indexed on sensor number
Return type:	pandas DataFrame

ulmo.cdec.historical.get_station_sensors(station_ids=None, sensor_ids=None, resolutions=None)¶

Gets available sensors for the given stations, sensor ids and time resolutions. If no station ids are provided, all available stations will be used (this is not recommended, and will probably take a really long time).

The list can be limited by a list of sensor numbers, or time resolutions if you already know what you want. If none of the provided sensors or resolutions are available, an empty DataFrame will be returned for that station.

Usage example:

from ulmo import cdec
# to get all available sensors
available_sensors = cdec.historical.get_station_sensors(['NEW'])

Parameters:	station_ids (iterable of strings or `None`) sensor_ids (iterable of integers or `None`) – check out or use the `get_sensors()` function to see a list of available sensor numbers resolutions (iterable of strings or `None`) – Possible values are ‘event’, ‘hourly’, ‘daily’, and ‘monthly’ but not all of these time resolutions are available at every station.
Returns:	dict – a python dict with site codes as keys with values containing pandas DataFrames of available sensor numbers and metadata.
Return type:	a python dict
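
Because stations with no matching sensors come back with an empty DataFrame, it can help to separate them out before requesting data. A small sketch, assuming the returned dict maps station codes to DataFrames as described above; `stations_with_sensors` is a hypothetical helper, not part of ulmo.

```python
def stations_with_sensors(available):
    """Split station codes into those with and without matching sensors.

    Hypothetical helper (not part of ulmo). `available` is the dict returned
    by get_station_sensors; len() of a pandas DataFrame is its row count, so
    an empty DataFrame has length 0.
    """
    found = sorted(code for code, table in available.items() if len(table) > 0)
    missing = sorted(code for code, table in available.items() if len(table) == 0)
    return found, missing
```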

ulmo.cdec.historical.get_data(station_ids=None, sensor_ids=None, resolutions=None, start=None, end=None)¶

Downloads data for a set of CDEC station and sensor ids. If either is not provided, all available data will be downloaded. Be really careful with choosing hourly resolution as the data sets are big, and CDEC’s servers are slow as molasses in winter.

Usage example:

from ulmo import cdec
dat = cdec.historical.get_data(['PRA'], resolutions=['daily'])

Parameters:	station_ids (iterable of strings or `None`) sensor_ids (iterable of integers or `None`) – check out or use the `get_sensors()` function to see a list of available sensor numbers resolutions (iterable of strings or `None`) – Possible values are ‘event’, ‘hourly’, ‘daily’, and ‘monthly’ but not all of these time resolutions are available at every station.
Returns:	dict – a python dict with site codes as keys. Values will be nested dicts containing all of the sensor/resolution combinations.
Return type:	a python dict

Lower Colorado River Authority (LCRA)¶

LCRA Hydromet Data¶

Access to hydrologic and climate data in the Colorado River Basin (Texas) provided by the Hydromet web site and web service from the Lower Colorado River Authority.

ulmo.lcra.hydromet.get_sites_by_type(site_type)¶

Gets a list of the hydromet site codes and descriptions for the given site type.

Parameters:	site_type (str) – For all but lake sites, this is the parameter code collected at the site. For lake sites, it is ‘lake’. See `site_types` and `PARAMETERS`.
Returns:	sites_dict – A python dict with four-character site codes mapped to site information.
Return type:	dict

ulmo.lcra.hydromet.get_site_data(site_code, parameter_code, as_dataframe=True, start_date=None, end_date=None, dam_site_location='head')¶

Fetches a site’s parameter data.

Parameters:

site_code (str) – The LCRA site code (four chars long) of the site you want to query data for.
parameter_code (str) – LCRA parameter code. See PARAMETERS.
start_date (None or datetime) – Start of a date range for a query.
end_date (None or datetime) – End of a date range for a query.
as_dataframe (True (default) or False) – This determines what format values are returned as. If True (default), the values will be a pandas.DataFrame object with the value timestamps as the index. If False, the values will be a Python dictionary.
dam_site_location (‘head’ (default) or ‘tail’) – The site location relative to the dam.

Returns:

df or values_dict (pandas.DataFrame or dict)

ulmo.lcra.hydromet.get_all_sites()¶: Returns a list of all LCRA hydromet sites as a geojson FeatureCollection.

ulmo.lcra.hydromet.get_current_data(service, as_geojson=False)¶

Fetches the current (near real-time) river stage and flow values from the LCRA web service.

Parameters:

service (str) – The web service providing data. See current_data_services. Currently we have GetUpperBasin and GetLowerBasin.
as_geojson (`True` or `False` (default)) – If True, the data is returned as a geojson FeatureCollection; if False, the data is returned as a list of dicts.

Returns:

current_values_dicts or current_values_geojson (list of dicts or geojson FeatureCollection)
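
With two services available, a caller may want both basins in one list. The sketch below takes the fetch function as an argument so it can be tried without network access. `combine_current_data` is a hypothetical helper, not part of ulmo, and the added `service` key is this example's own convention.

```python
# Service names listed in the parameter description above.
SERVICES = ("GetUpperBasin", "GetLowerBasin")


def combine_current_data(fetch, services=SERVICES):
    """Fetch each service and return one list of records tagged by service.

    Hypothetical helper (not part of ulmo). `fetch` is any callable taking a
    service name and returning a list of dicts, for example
    ulmo.lcra.hydromet.get_current_data with as_geojson left at False.
    """
    records = []
    for service in services:
        for record in fetch(service):
            tagged = dict(record)  # copy so the fetched record is not modified
            tagged["service"] = service
            records.append(tagged)
    return records
```

In real use this would be called as `combine_current_data(ulmo.lcra.hydromet.get_current_data)`, which requires ulmo and network access.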

LCRA Water Quality Data¶

Access to water quality data in the Colorado River Basin (Texas) provided by the Water Quality web site and web service from the Lower Colorado River Authority.

ulmo.lcra.waterquality.get_sites(source_agency=None)¶

Fetches a list of sites with location and available metadata.

Parameters:	source_agency (str) – The code LCRA uses for the agency that collects the data. There are sites whose sources are not listed, so this filter may not return all sites of a certain source. See `source_map`.
Returns:	sites_geojson
Return type:	geojson FeatureCollection

ulmo.lcra.waterquality.get_historical_data(site_code, start=None, end=None, as_dataframe=False)¶

Fetches data for a site at a given date.

Parameters:	site_code (str) – The site code to fetch data for. A list of sites can be retrieved with `get_sites()` date (`None` or date (see note on dates and times)) – The date of the data to be queried. If date is `None` (default), then all data will be returned. as_dataframe (bool) – This determines what format values are returned as. If `False` (default), the values dict will be a dict with timestamps as keys mapped to a dict of gauge variables and values. If `True` then the values dict will be a pandas.DataFrame object containing the equivalent information.
Returns:	data_dict – A dict containing site information and values.
Return type:	dict

ulmo.lcra.waterquality.get_recent_data(site_code, as_dataframe=False)¶

Fetches near real-time instantaneous water quality data for the LCRA bay sites.

Parameters:	site_code (str) – The bay site to fetch data for. See real_time_sites. as_dataframe (bool) – This determines what format values are returned as. If `False` (default), the values will be a list of value dicts. If `True`, the values are returned as a pandas.DataFrame.
Returns:	list of values or dataframe.
Return type:	list

Texas Weather Connection Daily Keetch-Byram Drought Index (KBDI)¶

ulmo.twc.kbdi.core¶

This module provides direct access to Texas Weather Connection - Daily Keetch-Byram Drought Index (KBDI) dataset.

ulmo.twc.kbdi.get_data(county=None, start=None, end=None, as_dataframe=False, data_dir=None)¶

Retrieves data.

Parameters:

county (`None` or str) – If specified, results will be limited to the county corresponding to the given 5-character Texas county FIPS code (i.e. 48???).
end (`None` or date (see note on dates and times)) – Results will be limited to data on or before this date. Default is the current date.
start (`None` or date (see note on dates and times)) – Results will be limited to data on or after this date. Default is the start of the calendar year for the end date.
as_dataframe (bool) – If `False` (default), a dict with a nested set of dicts will be returned with data indexed by 5-character Texas county FIPS code. If `True` then a pandas.DataFrame object will be returned. The pandas dataframe is used internally, so setting this to `True` is a little bit faster as it skips a serialization step.
data_dir (`None` or directory path) – Directory for holding downloaded data files. If no path is provided (default), then a user-specific directory for holding application data will be used (the directory will depend on the platform/operating system).

Returns:	data – A dict or pandas.DataFrame representing the data. See the `as_dataframe` parameter for more.
Return type:	dict or pandas.DataFrame
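
The documented defaults for start and end can be spelled out as follows. This is a sketch that mirrors the parameter descriptions above, not ulmo's actual implementation: `resolve_kbdi_range` and `is_texas_fips` are hypothetical helpers, and only `datetime.date` inputs are handled (the string formats from the note on dates and times are not parsed here).

```python
import datetime


def resolve_kbdi_range(start=None, end=None, today=None):
    """Apply the documented defaults: end is the current date, and start is
    January 1 of the end date's year.

    Hypothetical helper (not part of ulmo). `today` exists only so the
    default can be pinned in examples and tests.
    """
    if today is None:
        today = datetime.date.today()
    if end is None:
        end = today
    if start is None:
        start = datetime.date(end.year, 1, 1)
    if start > end:
        raise ValueError("start must not be after end")
    return start, end


def is_texas_fips(county):
    """Return True for a 5-character county FIPS code of the form 48???."""
    return (
        isinstance(county, str)
        and len(county) == 5
        and county.isdigit()
        and county.startswith("48")
    )
```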

US Army Corps of Engineers (USACE) - Tulsa District Water Control¶

Access to data provided by the United States Army Corps of Engineers - Tulsa District Water Control web site.

ulmo.usace.swtwc.get_stations()¶

Fetches a list of station codes and descriptions.

Returns:	stations_dict – a python dict with station codes mapped to station information
Return type:	dict

ulmo.usace.swtwc.get_station_data(station_code, date=None, as_dataframe=False)¶

Fetches data for a station at a given date.

Parameters:	station_code (str) – The station code to fetch data for. A list of stations can be retrieved with `get_stations()`. date (`None` or date (see note on dates and times)) – The date of the data to be queried. If date is `None` (default), then data for the current day is retrieved. as_dataframe (bool) – This determines what format values are returned as. If `False` (default), the values dict will be a dict with timestamps as keys mapped to a dict of gauge variables and values. If `True` then the values dict will be a pandas.DataFrame object containing the equivalent information.
Returns:	data_dict – A dict containing station information and values.
Return type:	dict
