Replicate Analysis

The Replicate Analysis run configuration in Source allows users to run the same model scenario multiple times using input time series replication. The functionality uses a cycled input data concept in which the data for the first year in the simulation is shifted progressively by one year or by a selected increment of years. In this way, multiple instances of any Source model can be run using input data derived from the cycled data, and results can be extracted for each replicate. This is useful for assessing the risk associated with decision variables. For instance, 'Replicate Analysis' can be used to assess the risk of spill in reservoirs under varying input climate conditions.

Once the user has defined the required number of replicates and increment and selected the data source to be cycled, the Source model runs for each replicate and provides an output for each replicate in the Results Manager. For example, if the data source selected for cycling is a time series of rainfall data, it is used for multiple simulations using the same model scenario to produce the required outputs, such as downstream flows. In this way, the user can analyse the effect of changes in rainfall patterns on the downstream flows.

Replicate analysis run configuration

To use this functionality, the user must first select the 'Replicate analysis' option from the run configuration drop-down list on the Simulation toolbar (Figure 1).

Figure 1. Simulation toolbar - Replicate analysis

If a new version of Source is installed and used for the first time, the 'Replicate analysis' option has to be added using the 'Add new configuration' tab of 'Edit>Scenario Options>Running Configurations'.

Run configuration

Once 'Replicate analysis' is selected, the user can configure the replicate run. Two configuration tabs ('Run Configuration' and 'Data Sources Configuration') are associated with this functionality, replacing the standard Run Configuration tab for a 'Single Analysis' run. The 'Run Configuration' tab of the configuration window is shown in Figure 2.

Figure 2. Replicate Analysis Configuration Window - Run Configuration

With this configuration, users can set up a multi-replicate simulation with multiple data sources and specify the start and end dates of the model run period, the number of replicates, the interval between consecutive replicates (Replicate Increment), the start cycle year for the replicates, whether to remove excess data, and the replicate name format. The 'Start Date' and 'End Date' are the model run dates against which the cycled data from the data sources are reported; they remain the same for all replicates. The 'Start Cycle Date' is the first data date used to generate replicate data, and its month and day are the same as those of the Start Date.

The description for each run configuration parameter is given in Table 1 below.

Table 1. Run Configuration parameters and their description.

Option	Description	Example
Start Date	The start date for the model run period.	01/11/1891
End Date	The end date of the model run period.	30/06/1892
Time Step	The time step of the run, which is the same as the time step in the source data.	Daily/Monthly
Run Separate Networks in Parallel	Option to run separate networks in parallel.
Max Replicates	The maximum number of replicates that can be entered. It is calculated from the other input parameters.	128
Number of Replicates	The number of times (replicates) the model will be run. It cannot be more than Max Replicates.	5
Replicate Increment	The interval between consecutive replicates.	1
Start Cycle Date	The first data date that will be used to generate the replicate data. The user only needs to select a year. Its month and day are the same as those in the Start Date.	1/11/1891
Remove excess data	Option to truncate (ignore) partial data in the source data.
Replicate Name Format	The format of the name shown in the results. There are three options: Replicate Start Year, Replicate Number, and Replicate Number and Start Year.	Replicate Number and Start Year (1:1891). (If the replicate number is more than 9, it could be 01:1891, 001:1891, etc.)
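
The three Replicate Name Format options can be illustrated with a short Python sketch. The helper name `replicate_name`, its arguments, and the rule of zero-padding the replicate number to the width of the total number of replicates are assumptions made for illustration only; Source may pad its sub-run names differently.

```python
def replicate_name(number, start_year, total, style="number_and_year"):
    """Build a sub-run label for one replicate (hypothetical helper)."""
    # Assumption: pad the replicate number to the width of the largest number.
    width = len(str(total))
    padded = str(number).zfill(width)
    if style == "start_year":
        return str(start_year)
    if style == "number":
        return padded
    if style == "number_and_year":
        return f"{padded}:{start_year}"
    raise ValueError(f"unknown style: {style}")


print(replicate_name(1, 1891, total=5))    # 1:1891
print(replicate_name(1, 1891, total=12))   # 01:1891
print(replicate_name(7, 1897, total=128))  # 007:1897
```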

The run configuration in Figure 2 is explained in Figure 3 below.

Figure 3. Explanation of the above replicate run configuration

The original data used in a replicate run depend on input parameters such as Start Cycle Date and Remove excess data. For example, suppose the source data run from 01/01/1891 to 30/06/2020 and the other entries in Figure 2 are unchanged:

If 2018 is selected for the Start Cycle Date, the source data used for Replicates 1 to 5 will be those of 01/01/2018 - 30/06/2019, 01/01/1891 - 30/06/1892, 01/01/1892 - 30/06/1893, 01/01/1893 - 30/06/1894, and 01/01/1894 - 30/06/1895. The partial data from 2020 are trimmed by the 'Remove excess data' option before the replicate data are built. The 2019 data cannot make a whole replicate, so the next replicate after 1:2018 starts from 01/01/1891.
If 2019 is selected for the Start Cycle Date, the source data used for Replicates 1 to 5 will be those of 01/01/1891 - 30/06/1892, 01/01/1892 - 30/06/1893, 01/01/1893 - 30/06/1894, 01/01/1894 - 30/06/1895, and 01/01/1895 - 30/06/1896. The partial data from 2020 are trimmed by the 'Remove excess data' option before the replicate data are built. The 2019 data cannot make a whole replicate, so the first replicate starts from 01/01/1891.
If 2020 is selected for the Start Cycle Date, the user interface will give an error message because the partial data in the year 2020 are trimmed and cannot be used for a replicate.

A leap day in the model run period (from the Start Date to the End Date) may not have a matching date in a replicate's data, and vice versa. Replicate Analysis handles these conflicts as follows; the examples are based on the data in Figure 2 and Figure 3:

If the model run period contains a leap day (e.g. 29/02/1892) and the replicate data does not (e.g. there is no 29/02/1893 in replicate 2:1892), the mean of the original values on 28/02/1893 and 01/03/1893 is assigned to 29/02/1892 for the model run.
If the model run period contains a leap day (e.g. 29/02/1892) and the replicate data also contains one (e.g. 29/02/1896 in replicate 5:1895), the original value on 29/02/1896 is assigned to 29/02/1892 for the model run.
If the model run period contains no leap day but the replicate data does, the original value on 29 February in the replicate data is ignored in the model run.

The details about handling conflicts of the leap year in Replicate Analysis are given in Table 2.

Table 2. Data used in the leap year in Replicate Analysis

Model Run Period	Replicate Name and the original data used
Model Run Period	1:1891	2:1892	3:1893	4:1894	5:1895
1/11/1891	1/11/1891	1/11/1892	1/11/1893	1/11/1894	1/11/1895
……	……	……	……	……	……
28/02/1892	28/02/1892	28/02/1893	28/02/1894	28/02/1895	28/02/1896
29/02/1892	29/02/1892	(28/02/1893+1/03/1893)/2	(28/02/1894+1/03/1894)/2	(28/02/1895+1/03/1895)/2	29/02/1896
1/03/1892	1/03/1892	1/03/1893	1/03/1894	1/03/1895	1/03/1896
……	……	……	……	……	……
30/06/1892	30/06/1892	30/06/1893	30/06/1894	30/06/1895	30/06/1896
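
The leap-day rules described above can be sketched in Python as follows. This is a minimal illustration, not Source's implementation: the function names, the dictionary-based series, and the sample values are all hypothetical, and the replicate is described by a whole-year offset between the model run date and the original data date.

```python
from datetime import date


def is_leap(year):
    """Gregorian leap-year test."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def replicate_value(series, run_date, year_offset):
    """Value reported on run_date for a replicate whose original data
    lie year_offset years after the model run period (illustrative)."""
    src_year = run_date.year + year_offset
    if run_date.month == 2 and run_date.day == 29 and not is_leap(src_year):
        # No 29 February in the original data: use the mean of 28 Feb and 1 Mar.
        before = series[date(src_year, 2, 28)]
        after = series[date(src_year, 3, 1)]
        return (before + after) / 2
    # Otherwise use the same day and month of the original data year. A 29 Feb
    # that exists only in the original data is never looked up, so it is ignored.
    return series[date(src_year, run_date.month, run_date.day)]


# Made-up sample values, for illustration only.
series = {
    date(1892, 2, 29): 2.0,
    date(1893, 2, 28): 4.0,
    date(1893, 3, 1): 6.0,
    date(1896, 2, 29): 9.0,
}
leap_day = date(1892, 2, 29)
print(replicate_value(series, leap_day, 0))  # replicate 1:1891 -> 2.0
print(replicate_value(series, leap_day, 1))  # replicate 2:1892 -> 5.0
print(replicate_value(series, leap_day, 4))  # replicate 5:1895 -> 9.0
```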

Figure 4 shows another example of the replicate run configuration. The Start Date and End Date cover multiple years at a monthly time step. The input data starting from 1/01/1957 are cycled and placed against the start and end dates of 1/01/1990 and 1/12/2000 respectively (monthly data are dated on the 1st day of each month).

Figure 4. Another example of Replicate Analysis Run Configuration

The above run configuration can be explained using Figure 5 below.

Figure 5. Explanation of the above run configuration example

Data sources configuration

When running a replicate analysis, the user can decide whether to cycle all the data sources or only selected data sources. For this, the 'Data Sources Configuration' tab is used as shown in Figure 6. All data sources are cycled in the left example configuration, whereas in the right example configuration, only one data source (Inflow_Crab_Creek.csv) is cycled.

Figure 6. Replicate Analysis Configuration Window - Data Sources Configuration

Figures 7, 8 and 9 below further illustrate different replicate analysis configurations and the resulting replicate outputs. The symbols in the figures represent the yearly or sub-yearly values/data corresponding to the original data dates. The start and end dates are the same for all replicates.

In Figure 7, the replicate increment is 1 year and original data from 2002 to 2008 are cycled and reported against dates between 2002 (start date) and 2008 (end date). The Source model is run three times (Number of replicates = 3) from the start date (01/01/2002) till the end date (31/12/2008), and the input time series selected for replication is cycled, incrementing by 1 year each time. Replicate 1 starts with the original data of 2002 reported against the start date of 2002; Replicate 2 starts with the original data corresponding to 2003 and Replicate 3 starts with the original data from 2004.

Figure 7. Replicate analysis configuration and output with replicates =3, increment =1

For the replicate run configuration as in Figure 7, the data is reported as given in Table 3.

Table 3. Model run year and replicated input data year for the three replicates

Replicate 1		Replicate 2		Replicate 3
Model run year	Replicated input data year	Model run year	Replicated input data year	Model run year	Replicated input data year
2002	2002	2002	2003	2002	2004
2003	2003	2003	2004	2003	2005
2004	2004	2004	2005	2004	2006
2005	2005	2005	2006	2005	2007
2006	2006	2006	2007	2006	2008
2007	2007	2007	2008	2007	2002
2008	2008	2008	2002	2008	2003

Figure 8 uses the same original data as Figure 7, with original data from 2002 to 2008 cycled and reported against dates between 2002 (start date) and 2008 (end date), but with a replicate increment of 2 and a Start Cycle Date of 2004. The model is run twice (number of replicates is 2) from the start date (01/01/2002) till the end date (31/12/2008), and the selected input time series is replicated by an increment of 2 years starting in 2004. Replicate 1 starts with original data from 2004, as this is the 'Start Cycle Date'. The second replicate starts with data from 2006 as the increment is two.

Figure 8. Replicate analysis configuration and output with replicates =2, increment =2

For the above configuration (Figure 8), the data is reported as given in Table 4.

Table 4. Model run year and replicated input data year for the two replicates

Replicate 1		Replicate 2
Model year	Replicated input data year	Model year	Replicated input data year
2002	2004	2002	2006
2003	2005	2003	2007
2004	2006	2004	2008
2005	2007	2005	2002
2006	2008	2006	2003
2007	2002	2007	2004
2008	2003	2008	2005

Figure 9 illustrates that the model can be run and cycled on a subset of the available input data. In this example, it is assumed original input data starts in 2000 and ends in 2010, but the modeller only requires a replicate analysis for 2002 – 2008. Selected input data are cycled and reported against dates from 2002 (start date) to 2008 (end date). In this example the number of replicates is still 2, but the Start Cycle date is 01/01/2005. The model is run twice (2 replicates) from 01/01/2002 to 31/12/2008, and the selected input time series is replicated by an increment of 2 years starting in 2005. Replicate 1 starts with original data from 2005, as it is the 'Start Cycle Date'. The second replicate starts with data from 2007 as the increment is two.

Figure 9. Replicate analysis configuration and output with replicates = 2, increment = 2

For the above configuration (Figure 9), the data is reported as given in Table 5.

Table 5. Model run year and replicated input data year for the two replicates

Replicate 1		Replicate 2
Model year	Replicated input data year	Model year	Replicated input data year
2002	2005	2002	2007
2003	2006	2003	2008
2004	2007	2004	2002
2005	2008	2005	2003
2006	2002	2006	2004
2007	2003	2007	2005
2008	2004	2008	2006
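
The year mappings in Tables 3 to 5 follow a simple wrap-around rule, which the Python sketch below reproduces. It is an illustration of the cycling concept only: the function `replicate_year_map` and its arguments are hypothetical, whole years are assumed, and options such as 'Remove excess data' are not modelled.

```python
def replicate_year_map(data_years, run_years, start_cycle_year, increment, replicates):
    """Return one {model_run_year: original_data_year} dict per replicate."""
    n = len(data_years)
    first = data_years.index(start_cycle_year)
    maps = []
    for r in range(replicates):
        # Each replicate starts `increment` years later than the previous one.
        offset = first + r * increment
        # Wrap back to the first data year once the last data year is used.
        maps.append({run_year: data_years[(offset + i) % n]
                     for i, run_year in enumerate(run_years)})
    return maps


years = list(range(2002, 2009))  # 2002 to 2008 inclusive

table3 = replicate_year_map(years, years, start_cycle_year=2002, increment=1, replicates=3)
table4 = replicate_year_map(years, years, start_cycle_year=2004, increment=2, replicates=2)
table5 = replicate_year_map(years, years, start_cycle_year=2005, increment=2, replicates=2)

print(list(table3[2].values()))  # [2004, 2005, 2006, 2007, 2008, 2002, 2003]
print(list(table4[1].values()))  # [2006, 2007, 2008, 2002, 2003, 2004, 2005]
print(list(table5[0].values()))  # [2005, 2006, 2007, 2008, 2002, 2003, 2004]
```

In the Figure 9 case the original data span 2000 to 2010, so the sketch is given only the 2002 to 2008 subset as `data_years`, matching the years that appear in Table 5.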

Results

The replicate analysis run produces a single run with a sub-run for each replicate, as shown in the Results Manager. Results of the run and all sub-runs can be exported to res.csv or Source Db format. Individual sub-run results can be exported to res.csv, Source Db or other formats.

The replicate analysis is further explained by using some Source model examples.

Example 1

Consider an example Source model as shown in Figure 10. The input data sources are time series monthly rainfall and two monthly demands and the data period spans from 1/01/1957 to 1/12/2003 for all of the data sources.

Figure 10. Example model and the replicate analysis configuration

As shown in Figure 10, for each replicate, the model will be run for an 11-year period starting from 1/01/1990 and ending on 1/12/2000 (monthly data are dated on the first day of each month). Though any time step can be used, a monthly time step is used in this example because the input data time step is monthly. The number of replicates is 5, with an increment of 1 year. The input data is cycled starting from 1/01/1957. Only rainfall data is cycled.

Once the configuration is set up, the model run produces the results shown in Figure 11 in the Results Manager 'Table' format. It can be seen from the left side of Results Manager that the scenario results are provided as five replicates as sub-runs. The sub-run names correspond to the replicate number and the start cycle year of each replicate with an increment of one year.

The 'Date' column in the table (right side of the figure) corresponds to the 'Start Date' and 'End Date' and dates in between. These dates would be the same for all replicates. Each column to the right of the 'Date' column represents each replicate (sub-run) of the data source 'Rainfall'. It can be seen from the table that for the first replicate, the values (data) for 1990 correspond to the original values of 1957, whereas in the second replicate, the values for 1990 correspond to those of 1958 and so on. For the fifth replicate, the values for 1990 are replaced with values corresponding to 1961.

Figure 11. Replicate run results as in Results Manager

Table 6 below shows how the original data dates are cycled in the above example.

Table 6. Reported date and the corresponding original data dates

Replicate run name	1957 (Replicate 1)	1958 (Replicate 2)	1959 (Replicate 3)	1960 (Replicate 4)	1961 (Replicate 5)
Reported starting date	Original data starting date	Original data starting date	Original data starting date	Original data starting date	Original data starting date
1990	1957	1958	1959	1960	1961
1991	1958	1959	1960	1961	1962
1992	1959	1960	1961	1962	1963
1993	1960	1961	1962	1963	1964
.	.	.	.	.	.
.	.	.	.	.	.
2000	1967	1968	1969	1970	1971

Example 2

Consider another example that illustrates the ability of the Replicate Analysis functionality to cycle part of a year of data and report it against the same start date. In the Replicate Analysis configuration in Figure 12, the cycle date starts on 1/05/1998, with 15 replicates and an increment of one year. The 'Start Date' is the same as the 'Start Cycle Date', while the 'End Date' is 30/11/1998. Therefore, the reporting period is only seven months. The data source (Creek_Inflows) period spans from 11/01/1998 to 31/12/2013.

Figure 12. Replicate Analysis Configuration with partial year data cycling

The model run results in 15 replicates, each having seven months of data (from 1st of May to 30th November) from each year between 1998 and 2012 reported against 1/05/1998 to 30/11/1998.

In Figure 13, the sub-run names indicate each replicate with seven months of data. The right side of the figure shows three replicates with data (from 1st May to 30th November) corresponding to start cycle years 1998, 1999 and 2000, reported against 01 May 1998 to 30 Nov 1998.

Figure 13. Replicate run results showing 15 replicates with partial year data cycled between 1/05/1998 and 30/11/1998

It should be noted that the number of replicates should be less than or equal to the number of years of data in the data source.

Table 7 below shows how the replicate analysis works in the above example.

Table 7. Reported date and the corresponding original data dates

Reported date	1998 (Replicate 1)	1999 (Replicate 2)	2000 (Replicate 3)	...	...	2012 (Replicate 15)
Reported starting date	Original data starting date	Original data starting date	Original data starting date	...	...	Original data starting date
1/05/1998 - 30/11/1998	1/05/1998 - 30/11/1998	1/05/1999 - 30/11/1999	1/05/2000 - 30/11/2000	...	...	1/05/2012 - 30/11/2012
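
The partial-year windows in the table above can be generated with the Python sketch below. It is illustrative only: the helper `replicate_windows` is hypothetical, and it neither wraps around the end of the data nor checks that enough years of data are available, so the caller must keep the number of replicates within the number of years in the data source.

```python
from datetime import date


def replicate_windows(start_date, end_date, start_cycle_year, increment, replicates):
    """List the (first, last) original data dates used by each replicate."""
    # Number of calendar years the run period crosses (0 for a within-year run).
    span_years = end_date.year - start_date.year
    windows = []
    for r in range(replicates):
        year = start_cycle_year + r * increment
        first = date(year, start_date.month, start_date.day)
        last = date(year + span_years, end_date.month, end_date.day)
        windows.append((first, last))
    return windows


windows = replicate_windows(date(1998, 5, 1), date(1998, 11, 30),
                            start_cycle_year=1998, increment=1, replicates=15)
for number, (first, last) in enumerate(windows, start=1):
    print(number, first.isoformat(), "to", last.isoformat())
```

Every window is reported against the same model run dates, 1/05/1998 to 30/11/1998.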

Source User Guide 5.50
