Calibration analysis - SRG

Overview

Description and rationale

Source includes a number of optimisation techniques and statistical measures for automated model calibration and to assist modellers with the evaluation of the quality of calibration. These are mainly intended for calibrating catchment rainfall-runoff models, but are also applicable when calibrating river system models (e.g. see Lerat et al., 2013). The available automatic optimisation algorithms are:

Shuffled complex evolution
Uniform random sampling
Rosenbrock method

Modellers have the option of selecting one optimisation technique or a combination of two optimisation techniques (in series).

Automated calibration requires the use of an objective function to direct the optimisation process. The Source calibration tool implements single objective optimisation, which reduces the comparison between the observed and modelled data during the calibration period to a single number to be optimised (multiple objective optimisation is also available, see Multi-objective optimisation - Insight - SRG for information).

Source implements five different basic types of objective function:

Nash Sutcliffe Coefficient of Efficiency (NSE)
Flow duration (specifically, the NSE of the flow duration)
Absolute bias
Bias penalty
Square-root daily, exceedance and bias

The NSE can be applied to daily or monthly data, and the NSE and flow duration objectives can be applied to data that has been transformed by taking the logarithm. Source also allows the user to create some composite objective functions, of which there are two types:

Combinations of the individual objective functions listed above. For example, the objective for calibrating streamflow at a gauging site could be a combination of the NSE and bias penalty.
Combinations of the objectives for different model outputs. For example, a model could be calibrated using a weighted combination of the objective function values at two or more different gauging sites.

Scale

Typically, the optimisation techniques and statistical measures are used to compare observed and estimated data at a point, such as streamflow data at a gauging station. Both the optimisation techniques and statistical measures can be applied on a daily or monthly basis.

Scientific Provenance

The statistical measures used in Source are well established. They are described in statistics textbooks, hydrology textbooks and papers such as Aitken (1973) and Nash and Sutcliffe (1970).

Overview information on the four optimisation techniques in Source is available in Vaze et al. (2011). Further information is in textbooks and papers, particularly for the genetic algorithm and uniform random sampling. Publications on the shuffled complex evolution method include papers by Duan et al. (1992) and Sorooshian et al. (1993). Publications on the Rosenbrock method include the paper by Rosenbrock (1960).

Version

Source version 3.8.8.

Dependencies

Requires observed data suitable for comparison of results from model calibration runs.

Availability

Provided with Source.

Some of these objective functions can be combined to create composite objective functions. For composite objective functions, the user is often able to enter a weight that determines the relative contribution of each objective function component to the overall objective function value.

Nash Sutcliffe Coefficient of Efficiency (NSE) of Daily Flows
Minimise Absolute Bias between Observed and Modelled Flows (calculated using daily flows)
Match to NSE of Daily Flows but Penalise Biased Solutions
Match to NSE of Monthly Flows
Match to NSE of Monthly Flows but Penalise Biased Solutions
Combined Match to NSE and Match to Flow Duration Curve (Daily)
Combined Match to NSE and Match to Logarithm of Flow Duration Curve (Daily)
Combined Match to NSE of Logarithms of Daily Flows with Bias Penalty
Combined Bias, Daily Flows and Daily Exceedance (Flow Duration) Curve (SDEB)

Further information on the first seven of these objective functions is available in Vaze et al. (2011), Section 6. Guidance on model calibration is available in many publications, including various eWater Best Modelling Practice Guidelines (Black et al., 2011; Vaze et al., 2011; Black and Podger, 2012; and Lerat, 2012).

Structure & processes

Background

As the optimisation techniques and statistical measures of calibration performance used in Source are well established, they are not re-described here. However, the objective functions used in the optimisation techniques have been customised for Source, so further information on them follows. As many of them rely on the Nash Sutcliffe Coefficient of Efficiency (NSE), its formulation is restated below.

The choice of any particular objective function will depend on the intended application. Each of the pre-defined objective functions is formulated to emphasise (i.e. reproduce as closely as possible) different flow characteristics (Vaze et al., 2011).

The discussion below assumes that the objective functions are being applied to streamflow data but they can be applied to any time series data.

Objective Function Name	Description
NSE Daily	Maximise the NSE of daily flows
NSE Monthly	Maximise the NSE of monthly flows
NSE Log Daily	Maximise the NSE of the logarithm of daily flows
Absolute Bias	Minimise the absolute value of the relative bias
NSE Daily & Bias Penalty	Maximise the NSE of daily flows and bias penalty
NSE Log Daily & Bias Penalty	Maximise the NSE of the logarithm of daily flows and bias penalty
NSE Monthly & Bias Penalty	Maximise the NSE of monthly flows and bias penalty
NSE Daily & Flow Duration	Maximise the NSE of daily flows and the NSE of the flow duration
NSE Daily & Log Flow Duration	Maximise the NSE of daily flows and the NSE of the flow duration of log flows
Square-root Daily, Exceedance and Bias

Missing Data

It is common for observed time series of hydrological processes to contain missing values. Also, the observed and modelled time series may have different start and end dates. The Source calibration tool calculates the objective function values using only data from those time steps for which both observed and modelled data is available.

The descriptions of the objective function equations assume that the observed and modelled data has been filtered to include only:

data from within the calibration period, and
data for time steps with complete data pairs.
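The filtering rule above can be sketched as follows. This is a minimal illustration, not the Source implementation: the function name `complete_pairs` is hypothetical, missing values are assumed to be represented as `None` or NaN, and both series are assumed to be already aligned by date and trimmed to the calibration period.

```python
import math


def complete_pairs(observed, modelled):
    """Keep only time steps where both observed and modelled values exist.

    Assumption: missing values are None or NaN, and both lists are
    already aligned by date and restricted to the calibration period.
    """
    def present(value):
        return value is not None and not math.isnan(value)

    return [(o, m) for o, m in zip(observed, modelled)
            if present(o) and present(m)]


obs = [1.0, None, 3.0, float("nan"), 5.0]
sim = [1.2, 2.0, None, 4.1, 4.8]
pairs = complete_pairs(obs, sim)
print(pairs)
```

Only the first and last time steps survive here, because every other step is missing a value in at least one of the two series.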

Nash Sutcliffe Coefficient of Efficiency (NSE)

NSE Daily

Application of this objective function involves maximising the NSE (i.e. getting it as close to 1.0 as possible). The calculation of the NSE is in accordance with Nash and Sutcliffe (1970) and uses observed and modelled daily flow data for all days within the calibration period for which observed daily flow data, including zero flow values (i.e. cease to flow), is available.

The NSE tends to produce solutions that match high and moderate flows very well but often will produce poor fits to low flows. It will also tend to favour solutions that provide a good match to the timing and shape of runoff events (Vaze et al., 2011).

The traditional formula for the NSE is:

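Equation 1	$\mathrm{NSE} = 1 - \dfrac{\sum_{i=1}^{N} \left(\mathit{Qobs}_i - \mathit{Qsim}_i\right)^2}{\sum_{i=1}^{N} \left(\mathit{Qobs}_i - \overline{\mathit{Qobs}}\right)^2}$

where:

Qobs_i is the observed flow on day i,

Qsim_i is the modelled flow on day i,

$\overline{\mathit{Qobs}}$ is the average of the observed flows, and

N is the number of days.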

Alternatively, the NSE may be written as:

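Equation 2	$\mathrm{NSE} = 1 - \dfrac{\sum_{i=1}^{N} \left(\mathit{Qobs}_i - \mathit{Qsim}_i\right)^2}{\sum_{i=1}^{N} \mathit{Qobs}_i^2 - \frac{1}{N}\left(\sum_{i=1}^{N} \mathit{Qobs}_i\right)^2}$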

This formulation obviates the necessity to calculate the average of the observed flows before evaluating the denominator in the traditional version.
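As a minimal sketch, the two formulations can be compared numerically. The code assumes the standard Nash and Sutcliffe (1970) form for the traditional version and, for the alternative, a denominator expanded as the sum of squared observed flows minus the squared sum divided by N; the function names `nse` and `nse_no_mean` are illustrative only.

```python
def nse(obs, sim):
    """Traditional NSE: 1 - SSE / sum of squared deviations from the mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def nse_no_mean(obs, sim):
    """Same value, but the denominator avoids a separate pass for the mean."""
    n = len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum(o * o for o in obs) - sum(obs) ** 2 / n
    return 1.0 - sse / spread


obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.0, 2.0, 3.0, 5.0]
print(nse(obs, sim), nse_no_mean(obs, sim))
```

Both functions assume the inputs have already been filtered to complete data pairs and that the observed flows are not all identical (otherwise the denominator is zero).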

NSE Log Daily

This objective function uses the same equation as for the NSE of daily flows (equation (1)), but applies it to log transformed data:

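Equation 3	$\mathrm{NSE}_{\log} = 1 - \dfrac{\sum_{i=1}^{N} \left(\log(\mathit{Qobs}_i + c) - \log(\mathit{Qsim}_i + c)\right)^2}{\sum_{i=1}^{N} \left(\log(\mathit{Qobs}_i + c) - \overline{\log(\mathit{Qobs} + c)}\right)^2}$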

where c is a small constant equal to the maximum of 1 ML and the 10th percentile of the observed flow.
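A hedged sketch of the log-transformed NSE follows. Several details are assumptions rather than documented behaviour: flows are taken to be in ML per time step (so the 1 ML floor is the number 1.0), the 10th percentile is computed by linear interpolation using Python's `statistics.quantiles` with `method="inclusive"`, and the natural logarithm is used. The names `log_offset` and `nse_log` are hypothetical.

```python
import math
import statistics


def nse(obs, sim):
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def log_offset(obs, floor=1.0):
    """c = max(1 ML, 10th percentile of observed flow).

    Assumption: percentile by linear interpolation (inclusive method).
    """
    p10 = statistics.quantiles(obs, n=10, method="inclusive")[0]
    return max(floor, p10)


def nse_log(obs, sim):
    """NSE applied to log(Q + c); c keeps zero flows finite."""
    c = log_offset(obs)
    return nse([math.log(o + c) for o in obs],
               [math.log(s + c) for s in sim])


obs = [0.0, 5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 320.0, 640.0, 1280.0, 2560.0]
sim = [1.1 * q for q in obs]
print(log_offset(obs), nse_log(obs, sim))
```

Because the NSE is a ratio of sums of squares, changing the base of the logarithm rescales numerator and denominator by the same factor and leaves the result unchanged.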

NSE Monthly

This objective function uses the same equation as for the NSE of daily flows (equation (1)), but applies it to monthly rather than daily data:

If the model is run on a daily time step, Qobs_i becomes the sum of the observed flows for month i and Qsim_i becomes the sum of the modelled flows for month i. The NSE calculation ignores observed and modelled data for all months where there are one or more days of missing data in the observed flow series.
If the model is run on a monthly time step, then the monthly values are unchanged.

The NSE of monthly flows can be useful for initial calibration because it tends to find solutions that will match the overall movement of water through the conceptual stores in the rainfall-runoff model, without being influenced by the timing of individual runoff events (Vaze et al., 2011).
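The monthly aggregation rule for a daily model can be sketched as below. This is an illustration under stated assumptions: missing observed days are represented as `None`, a day absent from the record also counts as missing, modelled values are present for every supplied day, and the function name `monthly_totals` is hypothetical.

```python
import calendar
from collections import defaultdict
from datetime import date, timedelta


def monthly_totals(dates, obs, sim):
    """Sum daily flows by month, skipping months with any missing observed day."""
    groups = defaultdict(list)
    for d, o, s in zip(dates, obs, sim):
        groups[(d.year, d.month)].append((o, s))

    obs_monthly, sim_monthly = [], []
    for (year, month), pairs in sorted(groups.items()):
        days_in_month = calendar.monthrange(year, month)[1]
        complete = (len(pairs) == days_in_month
                    and all(o is not None for o, _ in pairs))
        if complete:
            obs_monthly.append(sum(o for o, _ in pairs))
            sim_monthly.append(sum(s for _, s in pairs))
    return obs_monthly, sim_monthly


# January and February 2021, with one observed value missing on 10 February.
dates = [date(2021, 1, 1) + timedelta(days=k) for k in range(59)]
obs = [1.0] * 59
sim = [2.0] * 59
obs[40] = None
obs_m, sim_m = monthly_totals(dates, obs, sim)
print(obs_m, sim_m)
```

The resulting monthly series can then be passed to the same NSE calculation used for daily flows.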

Flow Duration

The flow duration objective function sorts the observed and modelled data values in increasing order and then calculates the NSE of the sorted data.

Log Flow Duration

This objective function calculates the flow duration objective function using the log transformed flows defined for the NSE Log Daily objective function (Equation (3)).
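The effect of sorting before computing the NSE can be seen in a small sketch. It assumes the standard NSE formula and that the observed and modelled series are sorted independently of each other; the function names are illustrative.

```python
def nse(obs, sim):
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def nse_flow_duration(obs, sim):
    """NSE of the ranked flows: each series is sorted on its own."""
    return nse(sorted(obs), sorted(sim))


# Same flow magnitudes as a good simulation, but with poor timing.
obs = [3.0, 1.0, 2.0, 4.0]
sim = [1.0, 2.0, 3.0, 5.0]
print(nse(obs, sim), nse_flow_duration(obs, sim))
```

In this example the day-by-day NSE is negative, while the flow duration NSE is high, because ranking removes all timing information and compares only the distribution of flows.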

Absolute Bias

This objective function will produce a match on the overall volume of flow generated but often will produce a poor fit to the timing of flows (Vaze et al., 2011). It has the following form:

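Equation 3	$B = \dfrac{\left|\sum_{i=1}^{N} \left(\mathit{Qsim}_i - \mathit{Qobs}_i\right)\right|}{\sum_{i=1}^{N} \mathit{Qobs}_i}$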

The evaluation of this objective function uses observed and modelled daily flow data for all days within the calibration period for which observed daily flow data, including zero flow values, is available.

Bias Penalty

The bias penalty objective function is described in Viney et al. (2009). The equation is given by:

Equation 4	$\text{Bias Penalty} = 5\,\left|\ln(1 + B)\right|^{2.5}$

where B is the absolute value of the relative bias, as defined for the Absolute Bias objective function above.
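A minimal sketch of these two quantities follows. It assumes that the absolute relative bias is the absolute difference in total flow divided by the total observed flow, and that the penalty takes the form given in Viney et al. (2009), 5|ln(1 + B)|^2.5. The function names are hypothetical.

```python
import math


def absolute_bias(obs, sim):
    """Absolute value of the relative bias in total flow volume."""
    return abs(sum(sim) - sum(obs)) / sum(obs)


def bias_penalty(obs, sim):
    """Assumed Viney et al. (2009) form: 5 * |ln(1 + B)| ** 2.5."""
    b = absolute_bias(obs, sim)
    return 5.0 * abs(math.log(1.0 + b)) ** 2.5


obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.0, 2.0, 3.0, 5.0]   # total volume overestimated by 10%
print(absolute_bias(obs, sim), bias_penalty(obs, sim))
```

Under this form the penalty is zero for an unbiased simulation and stays small for modest biases (roughly 0.014 for a 10% bias), growing steeply as the bias increases.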

Combinations of the NSE, Flow Duration and Bias Penalty Objective Functions

The following nine forms of objective function are available in Source:

1. Nash Sutcliffe Coefficient of Efficiency (NSE) of Daily Flows
2. Minimise Absolute Bias between Observed and Modelled Flows (calculated using daily flows)
3. Match to NSE of Daily Flows but Penalise Biased Solutions
4. Match to NSE of Monthly Flows
5. Match to NSE of Monthly Flows but Penalise Biased Solutions
6. Combined Match to NSE and Match to Flow Duration Curve (Daily)
7. Combined Match to NSE and Match to Logarithm of Flow Duration Curve (Daily)
8. Combined Match to NSE of Logarithms of Daily Flows with Bias Penalty
9. Combined Bias, Daily Flows and Daily Exceedance (Flow Duration) Curve (SDEB)

3. Match to NSE of Daily Flows but Penalise Biased Solutions

This objective function is a weighted combination of the daily NSE and a logarithmic function of bias based on Viney et al. (2009), and the aim is to find its maximum value.

Equation 4	$\text{Objective function} = \mathrm{NSE} - 5\,\left|\ln(1 + B)\right|^{2.5}$

where:

B is the bias; and

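Equation 5	$B = \dfrac{\sum_{i=1}^{N} \left(\mathit{Qsim}_i - \mathit{Qobs}_i\right)}{\sum_{i=1}^{N} \mathit{Qobs}_i}$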

The evaluation of this objective function uses observed and modelled daily flow data for all days within the calibration period for which observed daily flow data, including zero flow values, is available.

This formulation ensures that the models are calibrated predominantly to optimise NSE while ensuring a low bias in the total streamflow. It avoids solutions that produce biased estimates of overall runoff, which can produce marginal improvements in low flow performance over the NSE objective function. However, NSE-Bias will still be strongly influenced by moderate and high flows and by the timing of runoff events, which can still often result in poor fits to low flows (Vaze et al., 2011).
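The way the bias term separates two simulations with the same NSE can be sketched as follows. The code assumes the objective is NSE minus 5|ln(1 + B)|^2.5 (the form in Viney et al., 2009) with B taken as the signed relative bias in total flow; both the exact form and the names `relative_bias` and `nse_bias` are assumptions made for illustration.

```python
import math


def nse(obs, sim):
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def relative_bias(obs, sim):
    """Signed relative bias in total flow (assumed definition of B)."""
    return (sum(sim) - sum(obs)) / sum(obs)


def nse_bias(obs, sim):
    """Assumed form: NSE - 5 * |ln(1 + B)| ** 2.5; requires B > -1."""
    b = relative_bias(obs, sim)
    return nse(obs, sim) - 5.0 * abs(math.log(1.0 + b)) ** 2.5


obs = [1.0, 2.0, 3.0, 4.0]
under = [0.5 * q for q in obs]   # B = -0.5
over = [1.5 * q for q in obs]    # B = +0.5
print(nse(obs, under), nse(obs, over))
print(nse_bias(obs, under), nse_bias(obs, over))
```

Both simulations have the same NSE, but with a signed B the logarithm penalises the 50% underestimate more heavily than the 50% overestimate, so the combined objective ranks them differently.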

4. Match to NSE of Monthly Flows

5. Match to NSE of Monthly Flows but Penalise Biased Solutions

This objective function is the weighted combination of the monthly NSE and a logarithmic function of bias (Viney et al., 2009), and the aim is to find its maximum value. The equation used is the same as for the case “Match to NSE of Daily Flows but Penalise Biased Solutions” above. The NSE and Bias calculations ignore observed and modelled data for all months where there are one or more days of missing data in the observed flow series.

6. Combined Match to NSE and Match to Flow Duration Curve (Daily)

For this case the aim is to maximise the objective function, where:

Equation 6	Objective function = A × NSE daily flows + (1 − A) × NSE daily FDC

where:

A is a weighting factor whose value can be set by the modeller (0 ≤ A ≤ 1); and

NSE daily FDC is calculated using ranked value pairs of Qobs_i and Qsim_i.

This objective function and the following objective function are hybrids that compromise between the fit to the timing of high and moderate flows from the NSE component and the fit to the shape of the whole flow duration curve (FDC). The NSE-logFDC (below) will produce the closer fit to low flows (Vaze et al., 2011).

7. Combined Match to NSE and Match to Logarithm of Flow Duration Curve (Daily)

For this case the aim is to maximise the objective function, where:

Equation 7	Objective function = A × NSE daily flows + (1 − A) × NSE log₁₀(daily FDC)

where:

A is a weighting factor whose value can be set by the modeller (0 ≤ A ≤ 1);

NSE log₁₀(daily FDC) is calculated using ranked value pairs of log₁₀(Qobs_i+c) and log₁₀(Qsim_i+c).

c is the maximum of 1 ML and the 10th percentile of the observed flows. The use of this constant is intended to de-emphasise very small flows, which tend to be unreliable, and overcome the problem of trying to take logarithms of zero flows.
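The weighted combination in Equation 7 can be sketched as below. The default A = 0.5 follows the parameter table in this document; the percentile method (linear interpolation via `statistics.quantiles` with `method="inclusive"`), the flow units (ML per day) and the function name `nse_log_fdc_objective` are assumptions for illustration.

```python
import math
import statistics


def nse(obs, sim):
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def nse_log_fdc_objective(obs, sim, a=0.5):
    """A * NSE(daily flows) + (1 - A) * NSE(log10 of ranked flows + c)."""
    # Assumption: c = max(1 ML, 10th percentile by linear interpolation).
    c = max(1.0, statistics.quantiles(obs, n=10, method="inclusive")[0])
    log_obs = sorted(math.log10(o + c) for o in obs)
    log_sim = sorted(math.log10(s + c) for s in sim)
    return a * nse(obs, sim) + (1.0 - a) * nse(log_obs, log_sim)


obs = [0.0, 5.0, 10.0, 20.0, 40.0, 80.0, 160.0, 320.0, 640.0, 1280.0, 2560.0]
sim = list(obs)
sim[9], sim[10] = sim[10], sim[9]   # same flows, two largest days swapped
print(nse(obs, sim), nse_log_fdc_objective(obs, sim))
```

Here the flow duration component is perfect because only the timing differs, so the combined value sits halfway between the daily NSE and 1. Setting A to 1 recovers the plain daily NSE.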

8. Combined Match to NSE of Logarithms of Daily Flows with Bias Penalty

This objective function is given by:

Equation 8	Objective function = NSE(logarithms of daily flows) – Bias Penalty

NSE(logarithms of daily flows) is calculated using value pairs of ln(Qobs_i+c) and ln(Qsim_i+c), where B and c are defined in the same way as above. The Bias Penalty is based on Viney et al. (2009) and is:

Equation 9	$\text{Bias Penalty} = 5\,\left|\ln(1 + B)\right|^{2.5}$

This objective function captures the model’s ability to fit the shape of the observed daily flow hydrograph, with an emphasis on mid-range to low flows (in contrast to the arithmetic form of the NSE which tends to put an emphasis on medium to high flows), while ensuring a low bias in the total streamflow.

9. Combined Bias, Daily Flows and Daily Exceedance (Flow Duration) Curve (SDEB)

This objective function is based on the function introduced by Coron et al. (2012) and has been successfully applied in a number of projects (e.g. Lerat et al., 2013). It has the following equation:

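Equation 10	$\mathrm{SDEB} = \left[\alpha \sum_{i=1}^{N} \left(\mathit{Qobs}_i^{1/2} - \mathit{Qsim}_i^{1/2}\right)^2 + (1 - \alpha) \sum_{k=1}^{N} \left(\mathit{RQobs}_k^{1/2} - \mathit{RQsim}_k^{1/2}\right)^2\right] \left(1 + \dfrac{\left|\sum_{i=1}^{N} \left(\mathit{Qsim}_i - \mathit{Qobs}_i\right)\right|}{\sum_{i=1}^{N} \mathit{Qobs}_i}\right)$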

where:

α is a weighting factor whose value can be set by the modeller (0 ≤ α ≤ 1).

RQobs_k is the k’th ranked observed flow of a total of N ranked flows,

RQsim_k is the k’th ranked modelled flow of a total of N ranked flows, and

Other terms are as defined previously.

As explained by Lerat et al. (2013), this function combines three terms: (i) the sum of squared errors on power transform of flow, (ii) the same sum on sorted flow values and (iii) the relative simulation bias.

The coefficient α and the power transform are used to balance the three terms within the objective function. The weighting factor α is used to reduce the impact of timing errors on the objective function. This type of error can have a significant effect on the first term in the equation, where a slight misalignment of observed and simulated peak flow timing can result in large amplitude errors. Conversely, the second term is based on sorted flow values, which remain unaffected by timing errors. By way of example, Lerat et al. (2013), in their study of the Flinders and Gilbert Rivers in Northern Australia, used values of α of 0.1 for the Flinders calibration and 1.0 for the Gilbert calibration.

Using a power transform of less than 1 has the effect of reducing the weight of the errors in high flows, where the flow data are known to be less accurate. Lerat et al. (2013) found that a power transform of ½ led to the best compromise between high and low flow performance in their project. This value has been adopted in Source.
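The three terms described above can be sketched as follows. The form used here is an assumption based on the description of Lerat et al. (2013) in this section: a weighted sum of squared errors on square-root flows (in time order and in ranked order), multiplied by one plus the absolute relative bias. The function name `sdeb` is hypothetical, and the default α = 0.5 is taken from the parameter table in this document.

```python
import math


def sdeb(obs, sim, alpha=0.5):
    """Assumed SDEB form with a power transform of 1/2 (square root)."""
    # Term (i): squared errors on transformed flows, in time order.
    timing = sum((math.sqrt(o) - math.sqrt(s)) ** 2
                 for o, s in zip(obs, sim))
    # Term (ii): the same sum on ranked flows, insensitive to timing.
    ranked = sum((math.sqrt(o) - math.sqrt(s)) ** 2
                 for o, s in zip(sorted(obs), sorted(sim)))
    # Term (iii): absolute relative bias in total volume.
    bias = abs(sum(sim) - sum(obs)) / sum(obs)
    return (alpha * timing + (1.0 - alpha) * ranked) * (1.0 + bias)


obs = [1.0, 4.0, 9.0, 16.0]
sim = [4.0, 1.0, 9.0, 16.0]   # correct flows, first two days swapped
for alpha in (0.1, 0.5, 1.0):
    print(alpha, sdeb(obs, sim, alpha))
```

In this example the simulation has no bias and a perfect ranked fit, so the only contribution is the timing term scaled by α. Under this assumed form a perfect simulation scores 0, and smaller α values make the function more tolerant of timing errors.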

Data

Input data

Details on data to be input by the modeller are provided in the Source User Guide. Requirements for data series inputs to the various objective functions are included in the descriptions of each objective function, above.

Parameters or settings

Modellers have the option of selecting one optimisation technique, multiple optimisation techniques (in parallel), a combination of two optimisation techniques (in series), or no optimisation. Modellers can also select which objective function they wish to use. The other parameters the modeller can input are described in the following table:

Parameter	Description	Units	Default	Range
A	Weighting factor for the objective function in cases 6 and 7	Dimensionless	0.5	0 ≤ A ≤ 1
α	Weighting factor for the objective function in case 9	Dimensionless	0.5	0 ≤ α ≤ 1

Output data

Outputs include results of the evaluation of the selected objective function and other calibration performance statistics.

Reference list

Aitken, A.P. (1973). Assessing systematic errors in rainfall-runoff models. J. Hydrol, 20, 121–136.

Black, D.C. and Podger, G.M. (2012). Guidelines for modelling water sharing rules in eWater Source: towards best practice model application. eWater Cooperative Research Centre, Canberra, Australia. July. ISBN: 978-1-921543-74-6. Available via: www.ewater.com.au.

Black, D.C., Wallbrink, P.J., Jordan, P.W., Waters, D., Carroll, C., and Blackmore, J.M. (2011). Guidelines for water management modelling: towards best practice model application. eWater Cooperative Research Centre, Canberra, Australia. September. ISBN: 978-1-921543-46-3. Available via: www.ewater.com.au.

Coron, L., Andréassian, V., Perrin, C., Lerat, J., Vaze, J., Bourqui, M., and Hendrickx, F. (2012). Crash testing hydrological models in contrasted climate conditions: an experiment on 216 Australian catchments. Water Resources Research, 48, W05552, doi:10.1029/2011WR011721.

Duan, Q., Sorooshian, S. and Gupta, V. (1992). Effective and Efficient global optimization for conceptual rainfall-runoff models. Water Resources Research, 28(4), 1015-1031.

Lerat, J. (2012). Towards the adoption of uncertainty assessment in water resources models: the eWater Source uncertainty guideline. Proceedings of the 34th Hydrology and Water Resources Symposium, 19-22 November 2012, Sydney, NSW.

Lerat, J., Egan, C. A., Kim, S., Gooda, M., Loy, A., Shao, Q., and Petheram, C. (2013). Calibration of river models for the Flinders and Gilbert catchments. A technical report to the Australian Government from the CSIRO Flinders and Gilbert Agricultural Resource Assessment, part of the North Queensland Irrigated Agriculture Strategy. CSIRO Water for a Healthy Country and Sustainable Agriculture flagships, Australia.

Nash, J.E. and Sutcliffe, J.V. (1970). River flow forecasting through conceptual models, I, A discussion of principles. J. Hydrol, 10, 282–290.

Rosenbrock, H.H. (1960). An automatic method for finding the greatest or least value of a function. The Computer Journal, 3(3), 175-184.

Sorooshian, S., Duan, Q. and Gupta, V. (1993). Calibration of rainfall-runoff models: application of global optimization to the Sacramento Soil Moisture Accounting Model. Water Resources Research, 29(4), 1185-1194.

Vaze, J., Jordan, P., Beecham, R., Frost, A., Summerell, G. (2011). Guidelines for rainfall-runoff modelling: Towards best practice model application. eWater Cooperative Research Centre, Canberra, ACT. ISBN 978-1-921543-51-7. Available via www.ewater.com.au.

Viney, N.R., Perraud, J-M., Vaze, J., Chiew, F.H.S., Post, D.A. and Yang, A. (2009). The usefulness of bias constraints in model calibration for regionalisation to ungauged catchments. In: 18th World IMACS Congress and MODSIM09 International Congress on Modelling and Simulation, July 2009, Cairns: Modelling and Simulation Society of Australia and New Zealand and International Association for Mathematics and Computers in Simulation: 3421-3427.