4.2.6. HDF5 File Methods and Data Structures

mHSP2 reads the input HDF5 file and stores the pertinent portions in memory before starting the main time loop. At the end of simulation time, mHSP2 reopens this same HDF5 file and writes the specified time series outputs.

4.2.6.1. locaHSP2HDF5.py

Routines and data structures for processing the HSPsquared HDF5 file from the model input standpoint.

Use this module to isolate the reading of the input HSP2 HDF5 file and the corresponding model setup. Relative to HSPsquared, this program is reorganized so that the main loop is the time loop, which means that we do not want to keep the HDF5 file open for the entire simulation. Additionally, we are not providing full HSPF functionality at this time and need to identify for the user exactly what is and is not supported.

This module contains customizations to work with two different HDF5 file formats. The original HSPsquared HDF5 format belongs to the Python 2.7 version, which was the primary HSPsquared version prior to March 2020. A Python 3.6+ version of HSPsquared was released in March-April 2020. This updated version has a different HDF5 file format.

locaHSP2HDF5.ALLOPSEQ = None

Structured array or recarray to hold the operational sequence.

This is set from the contents of the HDF5 file in locaHSP2HDF5.

locaHSP2HDF5.AOS_DTYPE = dtype([('TARGET', '<U6'), ('ID', '<U4'), ('SDELT', '<U4'), ('DELT', '<f4')])

Structured array specification type.
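
As a minimal sketch of how a record array with this specification behaves, the example below builds a two-row array using the AOS_DTYPE field layout shown above. The row values are hypothetical and chosen only for illustration; how mHSP2 actually populates ALLOPSEQ is not shown here.

```python
import numpy as np

# Same field layout as the AOS_DTYPE value documented above.
AOS_DTYPE = np.dtype(
    [("TARGET", "<U6"), ("ID", "<U4"), ("SDELT", "<U4"), ("DELT", "<f4")]
)

# Hypothetical operational sequence entries, for illustration only.
rows = [
    ("PERLND", "P001", "60", 60.0),
    ("RCHRES", "R001", "60", 60.0),
]

# Build a structured array, then view it as a recarray so that fields
# can be reached by attribute (allopseq.TARGET) as well as by key.
allopseq = np.array(rows, dtype=AOS_DTYPE).view(np.recarray)

targets = list(allopseq.TARGET)
first_delt = float(allopseq.DELT[0])
```

Because the string fields have fixed widths (6 characters for TARGET, 4 for ID and SDELT), longer values would be truncated silently when stored in the array.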

locaHSP2HDF5.DFCOL_OPSEQ_DELT = 'DELT'

Column name for time step in operational sequence DataFrame

locaHSP2HDF5.DFCOL_OPSEQ_ID = 'ID'

Column name for ID in operational sequence DataFrame

locaHSP2HDF5.DFCOL_OPSEQ_SDELT = 'SDELT'

Column name for string time step in operational sequence DataFrame

locaHSP2HDF5.DFCOL_OPSEQ_TARG = 'TARGET'

Column name for TARGET in operational sequence DataFrame

locaHSP2HDF5.GENERAL = {}

Replaces general in original HSPsquared formulations

locaHSP2HDF5.HDF_FMT = 0

HDF file format to read.

HSPsquared changed the HDF5 file format in 2020 with the release that was Python 3 compatible. Unfortunately, neither format is well documented. If this value is 0, then read in the original format. If > 0, then read in the new format.

locaHSP2HDF5.HSP2_TIME_FMT = '%Y-%m-%d %H:%M'

Time format for extraction from HSP2
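
The format string can be exercised with the standard library, as in the short sketch below. The timestamp text is a hypothetical value, and this is only an assumption about typical use, not a description of how mHSP2 applies the constant internally.

```python
from datetime import datetime

# Same value as the HSP2_TIME_FMT constant documented above.
HSP2_TIME_FMT = "%Y-%m-%d %H:%M"

# Hypothetical simulation start time as text.
start_text = "2001-10-01 00:00"

# Parse the text into a datetime, then format it back to text.
start = datetime.strptime(start_text, HSP2_TIME_FMT)
round_trip = start.strftime(HSP2_TIME_FMT)
```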

locaHSP2HDF5.KEY_GEN_END = 'sim_end'

Key for simulation end time, in GENERAL dictionary

locaHSP2HDF5.KEY_GEN_START = 'sim_start'

Key for simulation start time, in GENERAL dictionary

locaHSP2HDF5.LINKDD = {}

Replaces linkdd which is the links database.

Data for LINK (combined NETWORK & SCHEMATIC) and MASSLINK information

locaHSP2HDF5.LOOKUP = {}

Replaces lookup.

Also not really used.

locaHSP2HDF5.MLDD = {}

Replaces mldd or the mass links.

Data for mass links.

locaHSP2HDF5.MONTHLYS = {}

Replaces monthlys, which are the dictionary of monthly tables.

Example: monthlys['PERLND', 'P001']['CEPSCM']
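
The nested layout in the example above can be sketched as follows, assuming a defaultdict keyed by (type, targID) tuples whose values are dictionaries of 12-value tuples, as described for getMONTHLYs. The numbers are hypothetical placeholders, not values from any model.

```python
from collections import defaultdict

# Outer key: (operation type, target ID); inner key: parameter name.
monthlys = defaultdict(dict)

# Hypothetical monthly values, January through December.
monthlys["PERLND", "P001"]["CEPSCM"] = (
    0.03, 0.03, 0.05, 0.08, 0.10, 0.12,
    0.12, 0.12, 0.10, 0.08, 0.05, 0.03,
)

# Look up the value for a calendar month (1 == January).
month = 7
cepsc_july = monthlys["PERLND", "P001"]["CEPSCM"][month - 1]
```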

locaHSP2HDF5.SEQUENCE = {}

Replaces sequence in original HSPsquared formulations

locaHSP2HDF5.TSDD = {}

Replaces tsdd which is the time series data structure.

Time series info default dictionary

locaHSP2HDF5.UCS = {}

Replaces ucs or user control.

Holds all default user control info in a dictionary

locaHSP2HDF5.XFLOWDD = {}

Replaces xflowdd.

This is not really used here.

locaHSP2HDF5.detHDF5Format(hdfname)

Determine the HDF5 format.

Need to know this to correctly read in the necessary values.

Parameters

hdfname (str) – HDF5 filename used for both input and output.

Returns

HDF5 file format; 0 == original format; 1 == new format

Return type

int

locaHSP2HDF5.getALLOPS()

Convenience function to return the module level global ALLOPSEQ

Returns

ALLOPSEQ

Return type

numpy.recarray

locaHSP2HDF5.getGENERAL()

Convenience function to return the module level global GENERAL

locaHSP2HDF5.getHDFFormat()

Get the integer format ID for this HDF5 file

Returns

HDF5 file format; 0 == original; 1 == new

Return type

int

locaHSP2HDF5.getLINKDD()

Convenience function to return the module level global LINKDD

locaHSP2HDF5.getMLDD()

Convenience function to return the module level global MLDD

locaHSP2HDF5.getMONTHLYs()

Get the dictionary of monthly values for the original or old HDF5 file format.

Returns

MONTHLYS; each key is a (type, targID) tuple whose value is a dictionary. The sub-dictionary has keys that are parameter names and values that are tuples of size 12.

Return type

defaultdict

locaHSP2HDF5.getSEQUENCE()

Convenience function to get the module level global SEQUENCE

locaHSP2HDF5.getUCS()

Convenience function to return the module level global UCS

locaHSP2HDF5.getUNITS()

Get the units from the GENERAL dictionary.

Only works for the old format HDF5 file. Also, the units must always be 1 because metric units are not supported.

Returns

integer telling which units are specified; 1 == standard; 2 == metric

Return type

int

locaHSP2HDF5.getnUCI()

Get the nUCI dictionary. For new format HDF5 files this replaces the UCS dictionary.

Returns

nUCI, user control information from new format

Return type

defaultdict

locaHSP2HDF5.initialHDFRead(hdfname, reloadkeys)

Determine the HDF5 file format and then call the method to read that format.

Parameters
  • hdfname (str) – HDF5 filename used for both input and output.

  • reloadkeys (bool) – Regenerates keys, used after adding new modules.

Returns

function status, 0 == success

Return type

int
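
The detect-then-dispatch logic described for initialHDFRead can be sketched as below. All function names here are hypothetical stand-ins; the real detHDF5Format, origHDFRead, and newHDFRead read an actual HDF5 file, which this sketch does not attempt. Only the argument lists and the 0 == original, 1 == new convention are taken from this section.

```python
calls = []  # records which stand-in reader was invoked


def read_original(hdfname, reloadkeys):
    """Hypothetical stand-in for origHDFRead; returns 0 for success."""
    calls.append(("orig", hdfname, reloadkeys))
    return 0


def read_new(hdfname):
    """Hypothetical stand-in for newHDFRead; returns 0 for success."""
    calls.append(("new", hdfname))
    return 0


def initial_read(hdfname, reloadkeys, detect):
    """Pick the reader that matches the detected file format.

    `detect` is a hypothetical stand-in for detHDF5Format and is
    passed in so the sketch can run without a real HDF5 file.
    """
    fmt = detect(hdfname)
    if fmt == 0:
        # Original format: the reader also takes the reloadkeys flag.
        return read_original(hdfname, reloadkeys)
    # Any value greater than 0 is treated as the new format.
    return read_new(hdfname)


status_old = initial_read("model.h5", True, detect=lambda name: 0)
status_new = initial_read("model.h5", True, detect=lambda name: 1)
```

Note that reloadkeys is forwarded only on the original-format path, which mirrors the documented signatures: newHDFRead takes hdfname alone.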

locaHSP2HDF5.nKEY_END = 'Stop'

New format key for simulation end time, in GENERAL dictionary

locaHSP2HDF5.nKEY_START = 'Start'

New format key for simulation start time, in GENERAL dictionary

locaHSP2HDF5.nUCI = {}

Replaces uci in new format HDF5 file.

Holds all default user control info in a dictionary

locaHSP2HDF5.newHDFRead(hdfname)

Logic to read the new format HDF file.

Extracted from main to read everything that is needed from the HDF5 file. These main items are now stored in module globals rather than keeping the file open.

Parameters

hdfname (str) – HDF5 filename used for both input and output.

Returns

function status, 0 == success

Return type

int

locaHSP2HDF5.origHDFRead(hdfname, reloadkeys)

Logic to read the original HDF file format.

Extracted from main to read everything that is needed from the HDF5 file. These main items are now stored in module globals rather than keeping the file open.

Parameters
  • hdfname (str) – HDF5 filename used for both input and output.

  • reloadkeys (bool) – Regenerates keys, used after adding new modules.

Returns

function status, 0 == success

Return type

int

locaHSP2HDF5.setGFTabDict(hdfname, tdict, gftab)

Set our global FTABLE dictionary which contains each defined FTABLE in the HDF5 file.

The keys are the FTAB name which is SVOLNO.

  • Only SVOL == “*” is supported

  • Only MFACTOR as a number is supported

Requirements: relies on TARG_DICT so must be called after checkOpsSpec

Parameters
  • hdfname (str) – fully qualified path for the input HDF file

  • tdict (dict) – target dictionary from locaMain

  • gftab (dict) – FTAB dictionary from locaMain

Returns

function status; success == 0

Return type

int

locaHSP2HDF5.setGTSDict(hdfname, simtimeinds, map_dict, gts)

Set our global time series dictionary which contains each defined time series in the HDF5 file.

The keys of the dictionary are the ts name which is SVOLNO.

  • Only SVOL == “*” is supported

  • Only MFACTOR as a number is supported

Note that RCHRES COLIND and OUTDG have not been tested.

Parameters
  • hdfname (str) – fully qualified path for the input HDF file

  • simtimeinds (dict) – SIMTIME_INDEXES from locaMain

  • map_dict (dict) – the mapping dictionary which will be modified here

  • gts (dict) – the global time series dictionary which will also be modified here.

Returns

function status; success == 0

Return type

int

locaHSP2HDF5.transform(ts, tindex, how)

Copy of the transform method from HSPsquared.

Because this function is included here, we do not need to import the HSPsquared package.

Parameters
  • ts (pandas.DataFrame) – time series in pandas DataFrame format

  • tindex (pandas.DatetimeIndex) – time index to use with the time series

  • how (str) – method for interpolation

Returns

resampled time series

Return type

pandas.DataFrame