
Note: This is documentation for version 5.16 of Source. For a different version of Source, select the relevant space by using the Spaces menu in the toolbar above.

Replicate Runner

Overview

The Replicate Runner runs a single eWater Source model multiple times using an ensemble of time-series inputs generated using the Time Series Cycle Creator (TSCC).

eWater Source Project Setup

To run a Replicate job, the eWater Source project must already be created with the time-series to be cycled loaded in Data Sources. It is possible to cycle a single Data Source or multiple Data Sources.

In order to be cycled, the time-series Data Sources must be configured as follows:

  1. Reload on Run must be enabled (which instructs eWater Source to reload the data from a file on disk before simulating the model).
  2. The location of the data file must be specified as a Relative Path, and it must be in the same directory as the eWater Source project file, or a sub-directory of that directory.
  3. The data file must be in the csv file format.
  4. The data files must contain only whole years of data; otherwise, the cycled data will not match the correct day or month.
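The whole-years requirement can be checked before uploading. The following is a minimal sketch, not part of Source or the Run Manager. It assumes that "whole years" means the data ends on the day before an anniversary of its first date; the function name `covers_whole_years` is hypothetical, and reading the first and last dates out of the csv file is left to the user because the file layout is not described here.

```python
from datetime import date, timedelta


def covers_whole_years(first: date, last: date) -> bool:
    """Return True if the inclusive range [first, last] spans a whole number of years.

    Assumption: a whole number of years means the day after `last` falls on
    the same day and month as `first`.
    """
    day_after = last + timedelta(days=1)
    return day_after > first and (day_after.month, day_after.day) == (first.month, first.day)


if __name__ == "__main__":
    # Ten whole years of daily data.
    print(covers_whole_years(date(2000, 1, 1), date(2009, 12, 31)))
    # Ends mid-year, so the data could not be cycled cleanly.
    print(covers_whole_years(date(2000, 1, 1), date(2009, 6, 30)))
```

A series that starts on 29 February would need special handling and is not covered by this sketch.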

For time series that are not to be cycled, Reload on Run must be disabled and the time series files must not be included in the uploaded zip.

Once configured, the eWater Source project and the csv files to be cycled must be placed in a zip archive file.

Note that the project file must be in the root of the zip archive. It must not be within a folder. Do not include any files in the zip other than a single project file and the input csv files to be cycled. In particular, do not include other csv files that are not timeseries files, as the Time Series Cycle Creator tool will fail to cycle the data and the Job will fail.

Example Replicate Project zip file

ReplicateExample.zip

In this example there are two input timeseries files, Rainfall.csv and Demand.csv, with the latter file being in the Demand directory.

  • ReplicateExample.zip
    • ReplicateTestProject.rsproj
    • Rainfall.csv
    • Demand
      • Demand.csv
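The layout rules above can be checked locally before uploading. The following is a minimal sketch using only the Python standard library; the function names `check_replicate_zip` and `check_replicate_zip_file` are hypothetical and are not part of the Run Manager. It checks file names only, so it cannot tell whether a csv file actually contains a time series.

```python
import zipfile
from pathlib import PurePosixPath


def check_replicate_zip(names):
    """Return a list of layout problems for the given zip member names."""
    # Directory entries end with "/" and are ignored.
    files = [PurePosixPath(n) for n in names if not n.endswith("/")]
    problems = []

    projects = [p for p in files if p.suffix.lower() == ".rsproj"]
    if len(projects) != 1:
        problems.append(f"expected exactly one .rsproj file, found {len(projects)}")
    for p in projects:
        if len(p.parts) != 1:
            problems.append(f"project file is not in the zip root: {p}")

    for p in files:
        if p.suffix.lower() not in (".rsproj", ".csv"):
            problems.append(f"unexpected file: {p}")
    return problems


def check_replicate_zip_file(path):
    """Open a zip archive on disk and check its member names."""
    with zipfile.ZipFile(path) as zf:
        return check_replicate_zip(zf.namelist())
```

For the example archive, `check_replicate_zip_file("ReplicateExample.zip")` would be expected to return an empty list.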


The Run Manager Cloud Application is currently configured to use multiple versions of Source and has two plugins available:

  • UrbanDeveloper
  • SubSource

Before running a project on Run Manager it is always recommended to save the Source project using a supported version to avoid needing to upgrade the project during a Source run.


Job Setup

Choose Project Zip

Click "Choose project zip" and select the .zip file that contains your Replicate project. Depending on the size of the file, it may take a moment to upload.

Once the project has finished being uploaded, the Replicate Runner configuration options will be displayed.

Source Project

Table 1 lists the eWater Source project configuration options.

Table 1. eWater Source project configuration options.
| Configuration Option | Description |
| --- | --- |
| Scenario | Choose the eWater Source scenario to be run |
| Running Configuration | Choose which Source Running Configuration should be simulated. Currently, the Run Manager is only designed to be used with Single Simulation. |

Time Series Cycle Creator (TSCC)

The TSCC generates variants (called replicates) of an existing time series by repeatedly 'cycling' the start of the time series to the end. Table 2 lists the TSCC parameters.

Table 2. TSCC parameters.
| Parameter | Description |
| --- | --- |
| Number of Replicates | The number of replicates (output time-series) that will be created |
| Start Cycle Date | The first data date that will be used by the TSCC to generate the replicates. This will default to the start date of the running configuration. |
| Start Reporting Year | The year that the replicates will be reported as starting in (the replicates will always use the same start day and month as the input time-series). The Start Reporting Year cannot be after the Source simulation start date, as this will cause the simulation to fail. This will default to the year of the running configuration's start date. |
| End Reporting Year | The year that the replicates will be reported as ending in. The End Reporting Year cannot be before the Source simulation end date, as this will cause the simulation to fail. This will default to the year of the running configuration's end date. |
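The two reporting-year constraints can be checked before queuing a job. The following is a minimal sketch; the function name `check_reporting_years` is hypothetical, and it compares at year level only, which is a simplifying assumption (it does not account for the day and month on which the replicates start).

```python
from datetime import date


def check_reporting_years(start_reporting_year, end_reporting_year, sim_start, sim_end):
    """Return a list of problems with the chosen reporting years.

    Assumption: the constraints are compared against the years of the
    simulation start and end dates.
    """
    problems = []
    if start_reporting_year > sim_start.year:
        problems.append("Start Reporting Year is after the simulation start date")
    if end_reporting_year < sim_end.year:
        problems.append("End Reporting Year is before the simulation end date")
    return problems


if __name__ == "__main__":
    # A simulation from 01/07/2004 to 30/06/2008 reported as 2004 to 2008.
    print(check_reporting_years(2004, 2008, date(2004, 7, 1), date(2008, 6, 30)))
```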


Example

Consider a daily time-series that spans the following 10-year period: 1/01/2000, 2/01/2000, 3/01/2000, ..., 31/12/2009. The data is used by a model that simulates a 4-year period from 01/07/2004 to 30/06/2008.

Running the TSCC with the following parameters would produce 4 replicates:

  • Number of Replicates = 4
  • Start Cycle Date = 01/01/2005
  • Start Reporting Year = 2004
  • End Reporting Year = 2008

The start date for all replicate timeseries files will be 01/01/2004 (the first day of Start Reporting Year) and the end date will be 31/12/2008 (the last day of End Reporting Year).

The values of the replicates will be taken from the input timeseries files corresponding to the original values as follows:

Replicate 1 - the time series values would correspond to the following years of the original timeseries files:

2005, 2006, 2007, 2008, 2009

The model would use the original 2005 to 2009 data for the simulation (and would report this as being 2004 to 2008).

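Replicate 2 - the first year of replicate 1 (2005) would be moved to the end of the cycled series, so the five-year window wraps around to the start of the original data (2000). Hence, the values of replicate 2 would correspond to the following years of the original timeseries files: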

2006, 2007, 2008, 2009, 2000

The model would use the original 2006 to 2009 data, plus 2000 data (and would report this as being 2004 to 2008).

Replicate 3 - the first year of replicate 2 (2006) would be moved to the end of the cycled series, bringing 2001 into the five-year window. Hence, the values of replicate 3 would correspond to the following years of the original timeseries files:

2007, 2008, 2009, 2000, 2001

The model would use the original 2007 to 2009 data, plus the 2000 and 2001 data (and would report this as being 2004 to 2008).

Replicate 4 - the first year of replicate 3 (2007) would be moved to the end of the cycled series, bringing 2002 into the five-year window. Hence, the values of replicate 4 would correspond to the following years of the original timeseries files:

2008, 2009, 2000, 2001, 2002

The model would use the original 2008 and 2009 data, plus the 2000 to 2002 data (and would report this as being 2004 to 2008).
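The year mapping in this example can be reproduced with a short sketch. This is an illustration of the cycling described above, not the TSCC implementation: it works at whole-year granularity, ignores leap days, and the function name `replicate_source_years` is hypothetical.

```python
def replicate_source_years(data_years, start_cycle_year, start_reporting_year,
                           end_reporting_year, n_replicates):
    """Return, for each replicate, the original years that supply its values."""
    # Number of reported years in each replicate (inclusive range).
    n_report = end_reporting_year - start_reporting_year + 1
    # Position in the original data where replicate 1 starts.
    offset = data_years.index(start_cycle_year)
    replicates = []
    for r in range(n_replicates):
        # Each replicate starts one year later and wraps around
        # to the start of the original data when it runs out.
        start = offset + r
        years = [data_years[(start + i) % len(data_years)] for i in range(n_report)]
        replicates.append(years)
    return replicates


if __name__ == "__main__":
    data_years = list(range(2000, 2010))  # 2000 to 2009 inclusive
    for number, years in enumerate(
            replicate_source_years(data_years, 2005, 2004, 2008, 4), start=1):
        print(f"Replicate {number}: {years}")
```

With the parameters of this example, the four lists match the year sequences given for replicates 1 to 4.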

Care must be taken when including timeseries files to be cycled:

  • All timeseries files must cover the same dates; otherwise, cycling will result in different years being used for different files
  • Only full years of data can be cycled; otherwise, the cycled data will not match the correct month and day
  • Timeseries files must be in csv format
  • Do not include any additional csv files in the zip that are not used as inputs in the project, particularly non-timeseries files, as the TSCC tool will not be able to cycle the data and the job will fail


Job Outputs

Choose which outputs to store and make accessible on completion of the job.

Queue

The Queue enables the user to configure the job details, estimate costs, and send the job to the queue. Table 3 lists the queue options.

Table 3. Queue options.

| Configuration Option | Description |
| --- | --- |
| Job Name | The name given to the job when viewed in the Run Manager |
| Agent Endpoints | Choose which type of agents should be used to run the job (will be hidden if Run Manager is configured to only have one type of agent endpoint) |
| Source Version | Choose the Source version to use for the Job |
| Number of Agents | Choose the maximum number of agents used to run the job |
| Model Run Estimate (Minutes) | (Optional) The user's estimate of how long it will take to run the Source project once |
| Estimate | (Optional) Estimates the costs of running the job based on the user's Model Run Estimate, the selected agents, and the job type. |
| Queue | Adds the job to the queue of jobs to be run |


Results

The Replicate Runner will provide the following result files in a zip:

  • job.log - Log file containing information and messages generated during the Job
  • A folder for each replicate run containing:
    • results.res.csv - The Source results file for all recorders enabled in the Source project
    • Replicate input timeseries files used for the run in the same folder structure as provided in the original zip file
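Once the results zip has been downloaded, the per-replicate results files can be located with a short script. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the folder names in the usage note (such as `Replicate_1`) are assumptions, because the naming of the replicate folders is not specified here.

```python
import zipfile
from pathlib import PurePosixPath


def find_result_files(names):
    """Map each folder that contains a results.res.csv to that file's path."""
    found = {}
    for name in names:
        path = PurePosixPath(name)
        if path.name == "results.res.csv":
            found[str(path.parent)] = name
    return found


def find_result_files_in_zip(path):
    """Open a results zip on disk and locate its results.res.csv files."""
    with zipfile.ZipFile(path) as zf:
        return find_result_files(zf.namelist())
```

For a zip containing `job.log`, `Replicate_1/results.res.csv` and `Replicate_2/results.res.csv`, the returned dictionary would have the keys `Replicate_1` and `Replicate_2`.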