Replicate Analysis

The Replicate Analysis run configuration in Source allows users to run the same model scenario multiple times using input time series replication. It uses a cycled input data concept: for each successive replicate, the input data reported against the start of the simulation is shifted forward by one year or by a user-selected number of years. Multiple instances of any Source model can therefore be run on input data derived from the same record, and results can be extracted for each replicate. This is useful for estimating the risk associated with decision variables. For instance, 'Replicate Analysis' can be used to assess the risk of spill in reservoirs under varying input climate conditions.

Once the user has defined the number of replicates and the increment, and has selected the data source to be cycled, the Source model runs once per replicate and provides an output for each replicate in the Results Manager. For example, if the data source selected for cycling is a rainfall time series, it is used in multiple simulations of the same model scenario to produce the required outputs, such as downstream flows. In this way the user can analyse the effect of changes in rainfall patterns on downstream flows.

Replicate analysis run configuration

To use this functionality, first the user has to select the 'Replicate analysis' option from the run configuration drop-down list on the Simulation toolbar (Figure 1). 

Figure 1. Simulation toolbar - Replicate analysis

If a new version of Source is installed and used for the first time, the 'Replicate analysis' option has to be added using the 'Add new configuration' tab of 'Edit>Scenario Options>Running Configurations'.

Run configuration

Once 'Replicate analysis' is selected, the user can configure the replicate run. Two configuration tabs are associated with this functionality, replacing the standard Run Configuration tab of a 'Single Analysis' run: 'Run Configuration' and 'Data Sources Configuration'. The 'Run Configuration' tab of the configuration window is shown in Figure 2.

Figure 2. Replicate Analysis Configuration Window - Run Configuration

Using this configuration, users can set up a multi-replicate simulation with multiple data sources and specify the start and end dates of the model run period, the number of replicates, the start date for replicates, and the interval between consecutive replicates (the replicate increment). The 'Start Date' and 'End Date' are the model run dates against which the cycled data from the data sources are reported; they remain the same for all replicates. The 'Start Cycle Date' is the first data date that will be used to generate replicate data.

Each run configuration parameter is described in Table 1.

Table 1. Run Configuration parameters and their description. 

| Option | Description | Example |
|---|---|---|
| Replicates | The number of times the model will be run. | 5 |
| Increment | The number of years to increment the selected replicate input data. | 1 |
| Start Cycle Date | The first data date that will be used to generate replicate data. | 1/11/1891 |
| Start Date | The start date of the model run period. | 01/11/1891 |
| End Date | The end date of the model run period. | 30/06/1892 |
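
The parameters in Table 1 can be pictured as a small record. The sketch below is illustrative only: `ReplicateConfig` and `data_start_dates` are hypothetical names, not part of Source, and the code assumes that each replicate begins reading data at the Start Cycle Date shifted forward by the increment once per preceding replicate.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ReplicateConfig:
    """Hypothetical container mirroring the options in Table 1."""
    replicates: int          # number of times the model is run
    increment: int           # years between consecutive replicates
    start_cycle_date: date   # first data date used for replicate 1
    start_date: date         # model run period start (same for all replicates)
    end_date: date           # model run period end (same for all replicates)

    def __post_init__(self):
        if self.replicates < 1 or self.increment < 1:
            raise ValueError("replicates and increment must be at least 1")
        if self.end_date < self.start_date:
            raise ValueError("end_date must not precede start_date")

    def data_start_dates(self):
        """Assumed first data date of each replicate, in replicate order."""
        d = self.start_cycle_date
        # Note: date.replace() raises for 29 February in a non-leap year;
        # that case is not handled in this sketch.
        return [d.replace(year=d.year + k * self.increment)
                for k in range(self.replicates)]


# Example values from Table 1.
config = ReplicateConfig(
    replicates=5,
    increment=1,
    start_cycle_date=date(1891, 11, 1),
    start_date=date(1891, 11, 1),
    end_date=date(1892, 6, 30),
)
starts = config.data_start_dates()
print([d.isoformat() for d in starts])
```

Under this assumption the five replicates would read data starting on 1 November of 1891, 1892, 1893, 1894 and 1895, each reported against the same run period.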


The run configuration shown in Figure 2 is illustrated in Figure 3.

Figure 3. Explanation of the above replicate run configuration


Another example of the replicate run configuration is shown in Figure 4. In this configuration, the start and end dates cover multiple years at a monthly time step. The input data starting from 1/01/1957 are cycled and reported against a start date of 1/01/1990 and an end date of 31/12/2000.

Figure 4. Another replicate analysis run configuration


This run configuration is illustrated in Figure 5.

Figure 5. Explanation of the above run configuration example

Data sources configuration

When running a replicate analysis, the user can choose to cycle all data sources or only selected data sources, using the 'Data Sources Configuration' tab shown in Figure 6. In the first example configuration all data sources are cycled, whereas in the second only one data source (Inflow_Crab_Creek.csv) is cycled.

Figure 6. Replicate Analysis Configuration Window - Data Sources Configuration
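
The choice made on this tab can be sketched as follows. This is an illustration, not Source code: `cycle_selected` is a hypothetical helper, `Rainfall.csv` is a made-up second data source, and each series is represented only by its year labels so that the shift is visible.

```python
def cycle_selected(sources, cycled_names, shift_years):
    """Rotate only the named sources by shift_years; copy the rest unchanged.

    sources maps a data source name to a list with one entry per year.
    """
    result = {}
    for name, series in sources.items():
        if name in cycled_names and series:
            k = shift_years % len(series)
            result[name] = series[k:] + series[:k]  # wrap back to the start
        else:
            result[name] = list(series)
    return result


# Year labels stand in for the actual data values.
sources = {
    "Inflow_Crab_Creek.csv": [2002, 2003, 2004, 2005],
    "Rainfall.csv": [2002, 2003, 2004, 2005],  # hypothetical, not cycled
}
second_replicate = cycle_selected(sources, {"Inflow_Crab_Creek.csv"}, shift_years=1)
print(second_replicate)
```

Only the selected source is shifted; every other input is passed to each replicate exactly as in a single run.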


Figures 7, 8 and 9 below further illustrate different replicate analysis configurations and the resulting replicate outputs. The symbols in the figures represent the yearly or sub-yearly values corresponding to the original data dates. The start and end dates are the same for all replicates.

In Figure 7, the replicate increment is 1 year and original data from 2002 to 2008 are cycled and reported against dates between 2002 (start date) and 2008 (end date). The Source model is run three times (Number of replicates = 3) from the start date (01/01/2002) till the end date (31/12/2008), and the input time series selected for replication is cycled, incrementing by 1 year each time. Replicate 1 starts with the original data of 2002 reported against the start date of 2002; Replicate 2 starts with the original data corresponding to 2003 and Replicate 3 starts with the original data from 2004.

Figure 7. Replicate analysis configuration and output with replicates =3, increment =1

For the replicate run configuration as in Figure 7, the data is reported as given in Table 2. 

Table 2. Model run year and replicated input data year for the three replicates

| Replicate 1 | | Replicate 2 | | Replicate 3 | |
|---|---|---|---|---|---|
| Model run year | Replicated input data year | Model run year | Replicated input data year | Model run year | Replicated input data year |
| 2002 | 2002 | 2002 | 2003 | 2002 | 2004 |
| 2003 | 2003 | 2003 | 2004 | 2003 | 2005 |
| 2004 | 2004 | 2004 | 2005 | 2004 | 2006 |
| 2005 | 2005 | 2005 | 2006 | 2005 | 2007 |
| 2006 | 2006 | 2006 | 2007 | 2006 | 2008 |
| 2007 | 2007 | 2007 | 2008 | 2007 | 2002 |
| 2008 | 2008 | 2008 | 2002 | 2008 | 2003 |
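
The pattern in Table 2 can be reproduced with modular arithmetic. The function below is a minimal sketch, not the Source implementation: `replicated_year` and its parameters are hypothetical names, and it assumes that data wraps from the last year of the cycled record (2008) back to the first (2002), as the table shows.

```python
def replicated_year(model_year, replicate, *, run_start_year, first_data_year,
                    last_data_year, start_cycle_year, increment):
    """Return the input data year reported against model_year.

    replicate is 1-based. The cycled record is assumed to cover the whole
    years first_data_year..last_data_year and to wrap around at the end.
    """
    n_years = last_data_year - first_data_year + 1
    offset = (
        (model_year - run_start_year)            # position within the run
        + (start_cycle_year - first_data_year)   # where replicate 1 starts
        + (replicate - 1) * increment            # shift for later replicates
    )
    return first_data_year + offset % n_years


# Figure 7 configuration: 3 replicates, increment 1, Start Cycle year 2002.
figure7 = dict(run_start_year=2002, first_data_year=2002, last_data_year=2008,
               start_cycle_year=2002, increment=1)
for model_year in range(2002, 2009):
    row = [replicated_year(model_year, r, **figure7) for r in (1, 2, 3)]
    print(model_year, row)
```

Each printed row lists the data year used by Replicates 1, 2 and 3 for that model run year.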


Figure 8 uses the same original data as Figure 7, with original data from 2002 to 2008 cycled and reported against dates between 2002 (start date) and 2008 (end date), but with a replicate increment of 2 and a 'Start Cycle Date' of 2004. The model is run twice (number of replicates = 2) from the start date (01/01/2002) till the end date (31/12/2008), and the selected input time series is shifted by an increment of 2 years starting in 2004. Replicate 1 starts with original data from 2004, as this is the 'Start Cycle Date'. Replicate 2 starts with data from 2006, as the increment is two.

Figure 8. Replicate analysis configuration and output with replicates =2, increment =2

For the above configuration (Figure 8), the data is reported as given in Table 3. 

Table 3. Model run year and replicated input data year for the two replicates

| Replicate 1 | | Replicate 2 | |
|---|---|---|---|
| Model year | Replicated input data year | Model year | Replicated input data year |
| 2002 | 2004 | 2002 | 2006 |
| 2003 | 2005 | 2003 | 2007 |
| 2004 | 2006 | 2004 | 2008 |
| 2005 | 2007 | 2005 | 2002 |
| 2006 | 2008 | 2006 | 2003 |
| 2007 | 2002 | 2007 | 2004 |
| 2008 | 2003 | 2008 | 2005 |
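
Table 3 can also be read as a rotation of the list of data years. The sketch below uses `collections.deque.rotate` from the Python standard library; the helper name `replicate_years` is hypothetical, and the wrap from 2008 back to 2002 is taken from the table rather than from any documented Source rule.

```python
from collections import deque


def replicate_years(data_years, start_cycle_year, increment, replicate):
    """Data years in the order they are reported for a 1-based replicate."""
    years = deque(data_years)
    shift = data_years.index(start_cycle_year) + (replicate - 1) * increment
    years.rotate(-shift)  # a negative rotation moves the front to the back
    return list(years)


data_years = list(range(2002, 2009))   # original data, 2002 to 2008
model_years = list(range(2002, 2009))  # Start Date to End Date

# Figure 8 configuration: 2 replicates, increment 2, Start Cycle year 2004.
rep1 = replicate_years(data_years, 2004, 2, replicate=1)
rep2 = replicate_years(data_years, 2004, 2, replicate=2)
for model_year, y1, y2 in zip(model_years, rep1, rep2):
    print(model_year, y1, y2)
```

Each printed line gives a model year followed by the data year used in Replicate 1 and in Replicate 2.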


Figure 9 illustrates that the model can be run and cycled on a subset of the available input data. In this example it is assumed original input data starts in 2000 and ends in 2010 but the modeller only requires a replicate analysis for 2002 – 2008. Selected input data are cycled and reported against dates from 2002 (start date) to 2008 (end date). In this example the number of replicates is still 2, but the Start Cycle date is 01/01/2005. The model is run twice (2 replicates) from 01/01/2002 to 31/12/2008, and the selected input time series is replicated by an increment of 2 years starting in 2005. Replicate 1 starts with original data from 2005, as it is the 'Start Cycle Date'. The second replicate starts with data from 2007 as the increment is two.

Figure 9. Replicate analysis configuration and output with replicates = 2, increment = 2

For the above configuration (Figure 9), the data is reported as given in Table 4. 

Table 4. Model run year and replicated input data year for the two replicates

| Replicate 1 | | Replicate 2 | |
|---|---|---|---|
| Model year | Replicated input data year | Model year | Replicated input data year |
| 2002 | 2005 | 2002 | 2007 |
| 2003 | 2006 | 2003 | 2008 |
| 2004 | 2007 | 2004 | 2002 |
| 2005 | 2008 | 2005 | 2003 |
| 2006 | 2002 | 2006 | 2004 |
| 2007 | 2003 | 2007 | 2005 |
| 2008 | 2004 | 2008 | 2006 |

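
In this case the record is longer than the window being cycled. The following is a minimal sketch, assuming (as Table 4 suggests) that cycling wraps at the end of the 2002 to 2008 window rather than at the end of the full 2000 to 2010 record; `cycle_window` is a hypothetical helper, not a Source function.

```python
from itertools import cycle, islice


def cycle_window(record_years, window_first, window_last,
                 start_cycle_year, increment, replicate):
    """Years reported for a 1-based replicate, restricted to a sub-window."""
    window = [y for y in record_years
              if y in range(window_first, window_last + 1)]
    if start_cycle_year not in window:
        raise ValueError("start_cycle_year must lie inside the window")
    start = (window.index(start_cycle_year)
             + (replicate - 1) * increment) % len(window)
    # Read len(window) years from an endless repetition of the window.
    return list(islice(cycle(window), start, start + len(window)))


record_years = range(2000, 2011)  # available input data, 2000 to 2010

# Figure 9 configuration: 2 replicates, increment 2, Start Cycle year 2005.
rep1 = cycle_window(record_years, 2002, 2008, 2005, 2, replicate=1)
rep2 = cycle_window(record_years, 2002, 2008, 2005, 2, replicate=2)
print(rep1)
print(rep2)
```

Data from 2000, 2001, 2009 and 2010 never appear in either list, which matches the table.
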

Results

The replicate analysis run produces a single run in the Results Manager, with one sub-run for each replicate. Results of the run and all of its sub-runs can be exported to res.csv or Source Db format. Individual sub-run results can be exported to res.csv, Source Db or other formats.


The replicate analysis is further explained by using some Source model examples.

Example 1

Consider the example Source model shown in Figure 10. The input data sources are a monthly rainfall time series and two monthly demand time series, and the data period spans from 1/01/1957 to 1/12/2003 for all of the data sources.

Figure 10. Example model and the replicate analysis configuration

As shown in Figure 10, for each replicate the model is run for an 11-year period from 1/01/1990 to 31/12/2000. Although any time step can be used, a monthly time step is used in this example because the input data are monthly. The number of replicates is 5, with an increment of 1 year. The input data are cycled starting from 1/01/1957. Only the rainfall data are cycled.

Once the configuration is set up, the model run produces the results shown in Figure 11 in the Results Manager 'Table' format. The left side of the Results Manager shows the scenario results as five replicates, listed as sub-runs. Each sub-run is named after the start cycle year of its replicate, which increases by one year from one replicate to the next.

The 'Date' column in the table (right side of the figure) contains the 'Start Date', the 'End Date' and the dates in between. These dates are the same for all replicates. Each column to the right of the 'Date' column represents one replicate (sub-run) of the data source 'Rainfall'. The table shows that for the first replicate the values for 1990 correspond to the original values of 1957, for the second replicate the values for 1990 correspond to those of 1958, and so on. For the fifth replicate, the values for 1990 are replaced with the values of 1961.

Figure 11. Replicate run results as in Results Manager


Table 5 shows how the original data dates are cycled in the above example.

Table 5. Reported date and the corresponding original data dates

| Replicate run name | 1957 (Replicate 1) | 1958 (Replicate 2) | 1959 (Replicate 3) | 1960 (Replicate 4) | 1961 (Replicate 5) |
|---|---|---|---|---|---|
| Reported starting date | Original data starting date | Original data starting date | Original data starting date | Original data starting date | Original data starting date |
| 1990 | 1957 | 1958 | 1959 | 1960 | 1961 |
| 1991 | 1958 | 1959 | 1960 | 1961 | 1962 |
| 1992 | 1959 | 1960 | 1961 | 1962 | 1963 |
| 1993 | 1960 | 1961 | 1962 | 1963 | 1964 |
| ... | ... | ... | ... | ... | ... |
| 2000 | 1967 | 1968 | 1969 | 1970 | 1971 |

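
For Example 1 no wrap-around is needed within the run period, so the mapping reduces to a constant year offset per replicate. The sketch below is illustrative: `original_date` is a hypothetical helper, and the offset rule is inferred from Table 5.

```python
from datetime import date


def original_date(reported, replicate, *, run_start, start_cycle, increment=1):
    """Original data date reported against `reported` for a 1-based replicate.

    Assumes month-start dates, so shifting the year never hits 29 February.
    """
    year_offset = ((start_cycle.year - run_start.year)
                   + (replicate - 1) * increment)
    return reported.replace(year=reported.year + year_offset)


run_start = date(1990, 1, 1)    # Start Date
start_cycle = date(1957, 1, 1)  # Start Cycle Date

for reported in (date(1990, 1, 1), date(1993, 1, 1), date(2000, 1, 1)):
    row = [original_date(reported, r, run_start=run_start,
                         start_cycle=start_cycle).year for r in range(1, 6)]
    print(reported.year, row)
```

The three printed rows correspond to the 1990, 1993 and 2000 rows of Table 5.
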

Example 2

Consider another example, which illustrates the ability of the Replicate Analysis functionality to cycle part of each year's data and report it against the same start date. In the Replicate Analysis configuration in Figure 12, the 'Start Cycle Date' is 1/05/1998, with 15 replicates and an increment of one year. The 'Start Date' is the same as the 'Start Cycle Date', while the 'End Date' is 30/11/1998. The reporting period is therefore only seven months. The data source (Creek_Inflows) spans from 11/01/1998 to 31/12/2013.

Figure 12. Replicate Analysis Configuration with partial year data cycling 

The model run results in 15 replicates, each having seven months of data (from 1st of May to 30th November) from each year between 1998 and 2012 reported against 1/05/1998 to 30/11/1998. 

In Figure 13, the sub-run names identify each replicate, each with seven months of data. The right side of the figure shows three replicates, with data from 1 May to 30 November of the start cycle years 1998, 1999 and 2000, reported against the period 01 May 1998 to 30 Nov 1998.

Figure 13. Replicate run results showing 15 replicates with partial year data cycled between 1/05/1998 and 30/11/1998


It should be noted that the number of replicates should be less than or equal to the number of years of data in the data source. 

Table 6 shows how the replicate analysis works in the above example.

Table 6. Reported date and the corresponding original data dates

| Reported date | 1998 (Replicate 1) | 1999 (Replicate 2) | 2000 (Replicate 3) | ... | 2012 (Replicate 15) |
|---|---|---|---|---|---|
| Reported starting date | Original data starting date | Original data starting date | Original data starting date | ... | Original data starting date |
| 1/05/1998 - 30/11/1998 | 1/05/1998 - 30/11/1998 | 1/05/1999 - 30/11/1999 | 1/05/2000 - 30/11/2000 | ... | 1/05/2012 - 30/11/2012 |