Data file formats

This section provides an overview of the file formats supported by Source. Table 5 lists the supported time-series data file formats, and Table 6 lists the raster data file formats. Several GIS, graphics and other formats that Source also recognises are listed in Table 7 but are not otherwise described in this guide. Click the link associated with each file extension to go directly to the information about that format.

Note: Formats with the ** symbol are part of the GDAL raster formats. A complete list of these is provided here.

Table 5. Text-based time-series data file formats

File extension	Description
.AR1	Annual stochastic time series
.AWB	AWBM daily time series
.BSB	SWAT BSB time series
.BSM	BoM 6 minute time series
.CDT	Comma delimited time series
.CSV	Comma-separated value
.DAT	F.Chiew time series
.IQQM	IQQM time series
.MRF	MFM monthly rainfall files
.PCP	SWAT daily time series
.SDT	Space delimited time series
.SILO5	SILO 5 time series
.SILO8	SILO 8 time series
.TTS	Tarsier daily time series

Table 6. Text-based raster data file formats

File extension	Description
.ASC**	ESRI ASCII grids
.MWASC	Map window ASCII grids
.TAPESG	Grid-based Terrain Analysis Data

Table 7. Other supported file formats

File extension	Description
.FLT	ESRI Binary Raster Interchange format
.JPG	GEO JPG Image (also .JPEG); must have an associated .jgw world file
.MIF	MapInfo Interchange
.SHP**	ESRI Shape files
.TIF**	GeoTIFF Image (also .TIFF)
.TILE	Tiled Raster Files
.TNE	Tarsier Node Link Network Files
.TRA	Tarsier Raster Files
.TSD	Tarsier Sites Data Files
.ADF**	ArcINFO/ESRI Binary Grid
.IMG**	ERDAS Imagine

Annual stochastic time series

The .AR1 format contains replicates of annual time-series data generated using the AR(1) stochastic method. The file format is shown in Table 8. This format is not the same as the AR(1) format (.GEN) generated and exported by the Stochastic Climate Library.

Table 8. AR1 data file format

Row	Column (space-separated)
Row	1	2	3..nypr
1	desc
2	nypr	nr
odd	rn
even	value	value	value

where:

desc is a title describing the collection site

nypr is the number of years per replicate

nr is the number of replicates

rn is the replicate number in the range 1..nr

value is one of the nypr data points per row for the replicate, to three decimal places.
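As a rough illustration of the layout in Table 8, the following Python sketch reads .AR1 content into a dictionary of replicates. The function name read_ar1 is hypothetical and is not part of Source. The sketch assumes that, after the two header rows, each replicate number row is followed by exactly one row holding its nypr values.

```python
def read_ar1(text):
    """Parse .AR1 content (Table 8) into (desc, {replicate_number: [values]})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    desc = lines[0].strip()
    nypr, nr = (int(tok) for tok in lines[1].split()[:2])
    replicates = {}
    # After the two header rows, rows alternate: replicate number, then its values.
    for i in range(2, len(lines) - 1, 2):
        rn = int(lines[i].split()[0])
        values = [float(tok) for tok in lines[i + 1].split()]
        if len(values) != nypr:
            raise ValueError(f"replicate {rn}: expected {nypr} values, got {len(values)}")
        replicates[rn] = values
    if len(replicates) != nr:
        raise ValueError(f"expected {nr} replicates, got {len(replicates)}")
    return desc, replicates
```

The two length checks simply enforce the nypr and nr counts declared in row 2.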

ESRI ASCII grids

The .ASC format is a space-delimited grid file with a six-line header, as shown in Table 9. Header keywords are not case sensitive. Data values are arranged in space-delimited rows and columns, reflecting the structure of the grid. Units for the cell side length depend on the input data and can be either geographic (eg degrees) or projected (eg metres, kilometres). Units are generally determined by the application, with metres (m) being common for most TIME-based applications. For a file format description, refer to:

http://resources.esri.com/help/9.3/arcgisengine/com_cpp/gp_toolref/spatial_analyst_tools/esri_ascii_raster_format.htm

ArcInfo grid coverages can be converted to .ASC files using ESRI’s GRIDASCII command. .ASC files can be imported into ArcGIS using the ASCIIGRID command.

Table 9. .ASC data file format

Row	Column (space-delimited)
Row	1	2	3..n
1	ncols	nc
2	nrows	nr
3	xref	x
4	yref	y
5	cellsize	size
6	nodata_value	sentinel
7..n	value	value	value

where:

nc is the number of columns

nr is the number of rows

xref is either XLLCENTER (x is the centre of the lower left cell) or XLLCORNER (x is the lower left corner of the grid)

yref is either YLLCENTER (y is the centre of the lower left cell) or YLLCORNER (y is the lower left corner of the grid)

(x,y) are the coordinates of the origin (the centre of the lower left cell or the lower left corner of the grid, according to xref and yref)

size is the cell side length

sentinel is a null data string (eg -9999)

value is a data point. There should be nc × nr data points.
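The header and grid structure in Table 9 can be sketched in Python as follows. The function name read_asc is hypothetical, and the sketch assumes exactly six header rows, that header keywords can be matched case-insensitively, and that all values are numeric. Cells equal to the sentinel are returned as None.

```python
def read_asc(text):
    """Parse ESRI ASCII grid content (Table 9) into (header, rows)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = {}
    for ln in lines[:6]:
        key, value = ln.split()[:2]
        header[key.lower()] = float(value)   # assumption: keywords matched case-insensitively
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    # Flatten all remaining tokens, then rebuild the grid row by row.
    tokens = [float(tok) for ln in lines[6:] for tok in ln.split()]
    if len(tokens) != ncols * nrows:
        raise ValueError(f"expected {ncols * nrows} data points, got {len(tokens)}")
    nodata = header["nodata_value"]
    rows = [
        [None if v == nodata else v for v in tokens[r * ncols:(r + 1) * ncols]]
        for r in range(nrows)
    ]
    return header, rows
```

Flattening the tokens before rebuilding rows means the sketch tolerates grid rows that are wrapped across several text lines, as long as the total is nc × nr.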

AWBM daily time series

An AWBM daily time-series format file (.AWB) is an ASCII text file containing daily time-series data formatted as shown in Table 10. Dates (the year and month) were optional in the original AWBM file format, but are not optional in the format used in Source.

Table 10. AWB data file format

Row	Column (space-separated)
Row	1	2..ndays+1	ndays+2	ndays+3
1..n	ndays	value	year	month

where:

ndays is the number of days in the month (28..31)

value is the data point corresponding with a given day in the month (ie. ndays columns)

year is the year of observation (four digits)

month is the month of observation (one or two digits).
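A minimal Python sketch of reading one row of the layout in Table 10 is given below. The function name parse_awb_line is hypothetical; the sketch assumes the year and month are always present, as Source requires.

```python
def parse_awb_line(line):
    """Parse one .AWB row (Table 10) into (year, month, [daily values])."""
    tokens = line.split()
    ndays = int(tokens[0])
    # One count, ndays values, then year and month.
    if len(tokens) != ndays + 3:
        raise ValueError(f"expected {ndays + 3} columns, got {len(tokens)}")
    values = [float(tok) for tok in tokens[1:ndays + 1]]
    year, month = int(tokens[ndays + 1]), int(tokens[ndays + 2])
    return year, month, values
```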

SWAT BSB time series

A .BSB is a line-based fixed-format file, typically used by applications written in FORTRAN. The header line gives the fields for the file with subsequent lines providing data for each basin to be used for each time-step. The format is shown in Table 11. For more details refer to the SWAT manual.

Table 11. .BSB data file format

Row	Character positions (space added)
Row	1..8	10..12	14..21	23..36	38..46
1	SUB	GIS	MON	AREAkm2	PRECIPmm
2..n	id	gis	mon	area	precip

where:

id is the basin identifier (both SUB and the id are text, left-aligned)

gis is the GIS value (integer, right-aligned, eg. "1")

mon is the month of observation (integer, right-aligned, eg. "0")

area is the basin area in square kilometres (real, right aligned, eg "1.14170E+02")

precip is the basin precipitation in millimetres (real, right aligned, eg "1.2000").
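Because the .BSB format is positional rather than delimited, a reader would typically slice each line by character position. The following Python sketch applies the positions from Table 11 to one data row; the names BSB_FIELDS and parse_bsb_line are hypothetical.

```python
# 1-based inclusive character positions from Table 11.
BSB_FIELDS = {
    "id": (1, 8),
    "gis": (10, 12),
    "mon": (14, 21),
    "area": (23, 36),
    "precip": (38, 46),
}

def parse_bsb_line(line):
    """Slice one .BSB data row by character position and convert each field."""
    raw = {name: line[start - 1:end].strip() for name, (start, end) in BSB_FIELDS.items()}
    return {
        "id": raw["id"],
        "gis": int(raw["gis"]),
        "mon": int(raw["mon"]),
        "area": float(raw["area"]),      # square kilometres
        "precip": float(raw["precip"]),  # millimetres
    }
```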

BoM 6 minute time series

A .BSM (also .PLUV) is a fixed-format file, typically supplied by the Australian Bureau of Meteorology for 6 minute pluviograph data. The file has two header lines (record types 1 and 2) followed by an arbitrary number of records of type 3. The formats of record types 1..3 are shown in Table 12, Table 13 and Table 14, respectively.

All fields in .BSM files use fixed spacing when supplied, but Source can also read space-separated values.

Rainfall data points:

Each row of data contains all of the observations for that day;
The number of observations for a day depends on the observation interval. For example, if the observation interval is 6 minutes, there will be 24×60÷6=240 observations (rain_i fields) in each row of data;
Each rain_i field is in FORTRAN format F7.1 (a field width of seven bytes with one decimal place);
Assuming that observations are numbered from 1..n, the starting column position of any given rain_i field can be computed from 14+7×i;
The unit of measurement is tenths of a millimetre (eg. a rainfall of 2 mm will be encoded as "20.0").
Values are interpreted as follows:
- 0.0 means there was no rain during the interval.
- a positive non-zero value is the observed rainfall, in tenths of a millimetre, during the interval.
- If there is zero rain for the whole day, no record is written for that day.

Missing data:

A sentinel value of -9999.0 means that no data is available for that interval;
A sentinel value of -8888.0 means that rain may have fallen during the interval but the total is known only for a period of several intervals. This total is entered as a negative value in the last interval of the accumulated period. For example, the following pattern would show that a total of 2 millimetres of rain fell at some time during an 18-minute period: -8888.0-8888.0 -20.0
If an entire month of data is missing, either no records are written or days filled with missing values (-9999.0) are written. No attempt is made to write dummy records if complete years of data are missing.
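The column arithmetic and sentinel rules above can be sketched in Python as follows. This is an illustration only: the function and constant names are hypothetical, and the sketch handles only the fixed-width layout (not the space-separated variant that Source can also read).

```python
MISSING = -9999.0        # no data available for the interval
ACCUMULATING = -8888.0   # rain may have fallen; total reported in a later interval

def rain_field_start(i):
    """1-based starting column of the i-th rain field (i = 1..n)."""
    return 14 + 7 * i

def split_rain_fields(line, n_fields=240):
    """Slice the F7.1 rain fields out of one fixed-width data record."""
    values = []
    for i in range(1, n_fields + 1):
        start = rain_field_start(i) - 1      # convert to a 0-based index
        chunk = line[start:start + 7]
        if not chunk.strip():
            break                            # record is shorter than n_fields
        values.append(float(chunk))
    return values

def decode_rain(values):
    """Convert raw values (tenths of a millimetre) to (kind, millimetres) pairs."""
    decoded = []
    for v in values:
        if v == MISSING:
            decoded.append(("missing", None))
        elif v == ACCUMULATING:
            decoded.append(("accumulating", None))
        elif v < 0:
            # Negative total closing an accumulated period.
            decoded.append(("accumulated_total", -v / 10))
        else:
            decoded.append(("observed", v / 10))
    return decoded
```

Keeping the accumulated total as its own kind, rather than spreading it across the preceding intervals, leaves any disaggregation decision to the caller.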

Example file

61078 1
61078 2 WILLIAMTOWN RAAF
61078 19521231 .0 .0 .0 [etc., 240 values]
61078 1953 1 1 .0 .0 .0 [etc., 240 values]
61078 1953 1 3 .0 .2 .0 [etc., 240 values]
61078 1953 115 .0 .0 .2 [etc., 240 values]
61078 1953 118 .0 .0 .0 [etc., 240 values]
61078 1953 212 .0 .0 .0 [etc., 240 values]
61078 1953 213 .0 .0 .0 [etc., 240 values]
61078 1953 214 .0 .0 .0 [etc., 240 values]
61078 19521231 .0 .0 .0 [etc., 240 values]
61078 19521231 .0 .0 .0 [etc., 240 values]

The following notes are taken from the Bureau of Meteorology advice:

All data available in the computer archive are provided. However very few sites have uninterrupted historical record, with no gaps. Such gaps or missing data may be due to many reasons from illness of the observer to a broken instrument. A site may have been closed, reopened, upgraded or downgraded during its existence, possibly causing breaks in the record of any particular element.
Final quality control for any element usually occurs once the manuscript records have been received and processed, which may be 6-12 weeks after the end of the month. Thus quality-controlled data will not normally be available immediately, in "real time".

Table 12. .BSM data file format (record type 1)

Row	Character positions (space padded)
Row	1..6	7..15	16	17..n
1..n	snum	blank	1	blank

where:

snum is the station number

blank is ASCII space characters

Table 13. .BSM data file format (record type 2)

Row	Character positions (space padded)
Row	1..6	7..12	13..16	17..18	19..20	21..n
1..n	snum	blank	year	month	day	{rain_i...}

where:

snum is the station number

year is the year of the observation (four digits)

month is the month of the observation (one or two digits, right-aligned, space padded)

day is the date of the observation (one or two digits, right-aligned, space padded)

rain_i is a rainfall data point, as explained under Rainfall data points.

Comma delimited time series

A .CDT comma delimited time-series format file is an ASCII text file that contains regular (periodic) time-series data. The file type commonly has no header line but, if required, it can support a single line header of "Date,Time series 1".

You can use the .CDT format to associate observations with a variety of time interval specifications. Table 15 shows how to structure annual data, Table 16 how to specify daily data aggregated at the monthly level, and Table 17 the more traditional daily time series (one date, one observation). Table 18 explains how to supply data in six-minute format.

Table 15. .CDT data file format (annual time series)

Row	Column (comma-separated)
Row	1	2
1..n	year	value

where:

year is the year of observation (four digits, eg. 2011)

value is the observed value (eg. 9876).

Table 16. .CDT data file format (time series with monthly data)

Row	Column (comma-separated)
Row	1	2
1..n	mm/yyyy	value

where:

mm is the month of observation (two digits, eg. 09)

yyyy is the year of observation (four digits, eg. 2011)

value is the observed value (eg. 2600).

Table 17. .CDT data file format (daily time series with daily data)

Row	Column (comma-separated)
Row	1	2
1..n	date	value

where:

date is the date of observation in ISO format (eg. 2000-12-31)

value is the observed value (eg. 2600).

Table 18. .CDT data file format (six-minute time series)

Row	Column (comma-separated)
Row	1	2	3..n
1..n	date	time	value

where:

date is the date of observation in ISO format (eg. 2000-12-31)

time is the time of observation in hours and minutes (eg 23:48)

value is the observed value (eg. 10).
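The four time-stamp layouts in Tables 15 to 18 can be told apart from the shape of each row. The following Python sketch shows one possible approach; parse_cdt_line is a hypothetical name, and the caller is assumed to have already skipped the optional header line.

```python
from datetime import datetime

def parse_cdt_line(line):
    """Parse one .CDT row into (timestamp, value) for the layouts in Tables 15 to 18."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) == 3:                      # Table 18: date, time, value
        stamp = datetime.strptime(parts[0] + " " + parts[1], "%Y-%m-%d %H:%M")
    elif "/" in parts[0]:                    # Table 16: mm/yyyy
        stamp = datetime.strptime(parts[0], "%m/%Y")
    elif "-" in parts[0]:                    # Table 17: ISO date
        stamp = datetime.strptime(parts[0], "%Y-%m-%d")
    else:                                    # Table 15: year only
        stamp = datetime(int(parts[0]), 1, 1)
    return stamp, float(parts[-1])
```

Monthly and annual rows are mapped to the first day of the period, which is a choice made for this sketch rather than a documented Source convention.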

Comma-separated value

A comma separated value or .CSV file is an ASCII text file that contains data in a variety of representations. When a .CSV contains regular (periodic) time-series data, there are at least two columns of data. The first contains a time-stamp and the remaining columns contain data points associated with the time-stamp. The format is shown in Table 19. All columns are separated using commas. Annual data can be entered using the notation 01/yyyy, where yyyy is a year. Header lines in .CSV files are usually optional.

Table 19. .CSV data file format

Row	Column (comma-separated)
Row	1	2..n
1	Date	desc
2..n	date	value

where:

desc is a title for the column (header rows are often optional)

date is a date in ISO 8601 format ("yyyy-MM-dd HH:mm:ss" where " HH:mm:ss" is optional)

value is a data point (eg a real number with one decimal place)
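As an illustration of Table 19, the Python sketch below reads a time-series .CSV with an optional header row. The name read_csv_series is hypothetical. The sketch handles only the ISO 8601 date forms listed above (not the 01/yyyy annual notation) and treats an unparseable first row as the header.

```python
import csv
import io
from datetime import datetime

def read_csv_series(text):
    """Parse time-series .CSV content into a list of (timestamp, [values])."""
    series = []
    for index, row in enumerate(csv.reader(io.StringIO(text))):
        if not row:
            continue
        stamp = row[0].strip()
        # The time-of-day part is optional.
        fmt = "%Y-%m-%d %H:%M:%S" if " " in stamp else "%Y-%m-%d"
        try:
            when = datetime.strptime(stamp, fmt)
        except ValueError:
            if index == 0:
                continue                     # assume the first row is a header
            raise
        series.append((when, [float(v) for v in row[1:]]))
    return series
```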

F.Chiew time series

A .DAT is a two-column daily time-series file with the fixed format shown in Table 20. Note that the first two characters in each line are always spaces with the data starting at the third character position.

Table 20. .DAT data file format

Row	Character positions (space padded)
Row	1..2	3..6	7..8	9..10	12..20
1..n	blank	year	month	day	value

where:

blank is ASCII space characters

year is the year of the observation (four digits)

month is the month of the observation (one or two digits, right-aligned, space padded)

day is the date of the observation (one or two digits, right-aligned, space padded)

value is the data point (real, two decimal places, right aligned, eg "1.20").
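The fixed character positions in Table 20 translate directly into string slices. A minimal Python sketch follows; parse_dat_line is a hypothetical name.

```python
from datetime import date

def parse_dat_line(line):
    """Parse one .DAT row (Table 20) by character position into (date, value)."""
    year = int(line[2:6])       # positions 3..6
    month = int(line[6:8])      # positions 7..8, space padded
    day = int(line[8:10])       # positions 9..10, space padded
    value = float(line[11:20])  # positions 12..20
    return date(year, month, day), value
```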

IQQM time series

An .IQQM time-series format file is an ASCII text file that contains daily, monthly or annual time-series data. The file has a five line header formatted as shown in Table 21. The header is followed by as many tables as are needed to describe the range delimited by fdate..ldate. The format of each table is shown in Table 22.

Each value is right-justified in 7 character positions with one leading space and one trailing quality indicator. In other words, there are five character positions for digits which are space-filled and right-aligned. The first value in each row (ie the observation for the first day of the month) occupies character positions 5..11. The second value occupies character positions 12..18, the third value positions 19..25, and so on across the row. In months with 31 days, the final value occupies character positions 215..221. The character positions corresponding with non-existent days in a given month are entirely blank. The mtotal and ytotal fields can support up to 8 digits. Both are space-filled, right-aligned in character positions 223..230.

The quality indicators defined by IQQM are summarised in Table 23. At present, Source does not act on these quality indicators.

Missing data points are generally represented as "-1?". A value is also considered to be a missing data point if it is expressed as a negative number and is not followed by either an "n" or "N" quality indicator.

Divider lines consist of ASCII hyphens (0x2D), beginning in character position 5 and ending at position 231.

Example file

Title: Meaningful title Date:06/08/2001 Time:11:38:25.51
Site : Dead Politically Correct Person's Creek
Type : Flow
Units: ML/d
Date : 01/01/1898 to 30/06/1998   Interval : Daily
Year:1898
     ------------------------------------ ------------------------------------
       01   02   03    04   05   06  ...  28   29   30   31   Total
     ------------------------------------ ------------------------------------
Jan    3     4    3     4   3   4      2   3     2     3     224
Feb    2     3     2     3   2   3      2              134
Mar    3     22   4     2   2   2      1   2     1     2     84
Apr    1     2     1     2   1   2      1   1     1      37
May    1     1     4     3    53     33       1   1     1     1     143
Jun    1     1     0     1   -1?   7      63   58   52      816
Jul    48   43   40   36   33     30       77   70   63    59      1389
Aug    54   49   46   41   39     35       30   28   26    420     2433
Sep   880   362   282   256  245     215      241  39   36        4414
Oct    35    33    31    31   29     28       22   28   20    17      783
Nov    15    16   15    18   16     15       11   12   11        415
Dec    12    11   11    11   11     10       9     8   9     8     422
----------------------------------------- ------------------------------------
    11294

Table 21. IQQM data file format (header)

Row	Character range	Key	Character range	Value
1	1..6	Title:	8..47	title
	54..58	Date:	59..68	cdate
	71..75	Time:	76..86	ctime
2	1..6	Site:	8..47	site
3	1..6	Type:	8..22	type
4	1..6	Units:	8..17	units
5	1..6	Date:	8..17	fdate
	19..20	to	22..31	ldate
	36..45	Interval:	47..n	interval
6	<<blank line>>

where:

title is a string describing the file’s contents

cdate is the date on which the time series was created (dd/mm/yyyy)

ctime is the time on cdate when the time series was created (hh:mm:ss.ms)

site is a string describing the measurement site

type is a string specifying the data type (eg. precipitation, evaporation, gauged flow)

units is a string specifying the units of data (eg. mm, mm*0.1, ML/day)

fdate is the first date in the time series (dd/mm/yyyy)

ldate is the last date in the time series (dd/mm/yyyy)

interval is a string defining the collection interval (eg. daily, monthly)

Table 22. IQQM data file format (table)

Row	Logical column (fixed width)
Row	1	2..13	14
+0	Year: year Factor= factor
+1	<<divider line>>
+2		dd	Total
+3	<<divider line>>
+4.. +15	mmm	value	mtotal
+16	<<divider line>>
+17			ytotal
+18	<<divider line>>

where:

year defines the year implied for the following table (yyyy)

factor is an optional multiplier. If present, each value in the table is multiplied by factor; if omitted, the default is 1.0

dd is the day of the month from 01..31 (zero-padded)

mmm is the first three characters of the name of the month (eg. Jan, Feb)

value is a data point. There should be as many data points in the row as the month has days

mtotal is the sum of the daily values in the month

ytotal is the sum of the monthly values in the year.

Table 23. IQQM data file format (quality indicators)

Character	Interpretation
" " (space)	Accept value as is
*	Multiply value by +1,000.0
e	The value is only an estimate
E	The value is only an estimate but it should be multiplied by 1,000
n	Multiply value by -1.0
N	Multiply value by -1,000.0
?	Missing data indication (typically input as "-1?")
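The field positions and quality indicators described above can be sketched in Python as follows. The names are hypothetical. The sketch applies the interpretations in Table 23 literally, which is not how Source behaves at present, since Source does not act on the indicators. An unrecognised indicator raises a KeyError.

```python
# Multiplier implied by each quality indicator in Table 23 ("?" is handled separately).
MULTIPLIERS = {" ": 1.0, "*": 1000.0, "e": 1.0, "E": 1000.0, "n": -1.0, "N": -1000.0}

def split_iqqm_row(line):
    """Slice the 31 seven-character day fields, starting at character position 5."""
    return [line[4 + 7 * d:11 + 7 * d] for d in range(31)]

def decode_iqqm_field(field, factor=1.0):
    """Decode one day field; returns None for a blank or missing value."""
    field = field.ljust(7)                   # restore any trimmed trailing spaces
    number, quality = field[:6].strip(), field[6]
    if not number or quality == "?":
        return None                          # non-existent day or missing data
    value = float(number)
    if value < 0 and quality not in "nN":
        return None                          # negative without n/N is also missing
    return value * MULTIPLIERS[quality] * factor
```

For example, decode_iqqm_field(f) for f in split_iqqm_row(line) yields one entry per day column, with None for days that do not exist in the month.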

MFM monthly rainfall files

A .MRF text file format contains a header line followed by a line giving the number of years of data. Data are formatted in lines with year given first, followed by 12 monthly values, all space separated. The format is shown in Table 24.

Table 24. .MRF data file format

Row	Column (space-delimited)
Row	1	2..13
1	desc
2	nyears
3..n	year	mvalue

where:

desc is a string describing the file’s contents (eg "Swiftflow River @ Wooden Bridge")

nyears states the number of years (rows) of data in the file

year is the year of observation (four digits)

mvalue is a data point. Each year should have 12 data points in the order January...December.
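A short Python sketch of reading the layout in Table 24 is shown below. The name read_mrf is hypothetical; the sketch assumes every data row holds a year followed by exactly 12 values.

```python
def read_mrf(text):
    """Parse .MRF content (Table 24) into (desc, {year: [12 monthly values]})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    desc = lines[0].strip()
    nyears = int(lines[1].split()[0])
    data = {}
    for ln in lines[2:2 + nyears]:
        tokens = ln.split()
        if len(tokens) != 13:
            raise ValueError(f"expected a year and 12 values, got {len(tokens)} columns")
        data[int(tokens[0])] = [float(tok) for tok in tokens[1:]]
    if len(data) != nyears:
        raise ValueError(f"expected {nyears} years of data, got {len(data)}")
    return desc, data
```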

Map window ASCII grids

The .MWASC ASCII grid is similar to .ASC except that the coordinates are offset by 1/2 cell size and the header rows do not have titles. Thus there are six header rows with parameters only, followed by the gridded data. The format is shown in Table 25.

Table 25. MWASC data file format

Row	Column (space-delimited)
Row	1	2..n
1	nc
2	nr
3	xc
4	yc
5	size
6	sentinel
7..n	value	value

where:

nc is the number of columns

nr is the number of rows

(xc,yc) are the coordinates of the centre of the cell at the lower left corner of the grid

size is the cell side length

sentinel is a null data string (eg. -9999)

value is a data point. There should be nc × nr data points.
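The half-cell offset between the two grid formats is simple arithmetic: the lower left corner of a cell lies half a cell side to the left of, and below, its centre. The Python sketch below converts an .MWASC origin to the corner-style origin used by .ASC with XLLCORNER and YLLCORNER. The function name is hypothetical, and the sketch assumes square cells.

```python
def mwasc_centre_to_corner(xc, yc, size):
    """Return the lower left corner of the lower left cell, given its centre."""
    half = size / 2.0
    return xc - half, yc - half
```

For a 10 m grid whose lower left cell is centred on (105, 205), the corner-style origin is therefore (100, 200).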

SWAT daily time series

A SWAT daily rainfall time-series format file (.PCP) is an ASCII text file that contains daily time-series rainfall data. The file has a four line header followed by daily data values as shown in Table 26.

Table 26. .PCP data file format

Row	Column (space-delimited)
Row	1	2
1	desc
2	Lati	lat
3	Long	lon
4	Elev	mahd
5..n	yyyydddvvv.v

where:

desc is a string describing the file’s contents (eg. "Precipitation Input File")

lat is the latitude of the site in degrees (eg 14.77)

lon is the longitude of the site in degrees (eg 102.7)

mahd is the elevation of the site in metres (eg 167)

yyyy is the year

ddd is the Julian day offset within the year

vvv.v is the data value expressed as four digits with one decimal place.
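Each data row packs the year, day and value together without separators, so it has to be sliced by position. A minimal Python sketch follows; parse_pcp_line is a hypothetical name, and the sketch assumes that day 001 corresponds to 1 January.

```python
from datetime import date, timedelta

def parse_pcp_line(line):
    """Parse one .PCP data row (yyyydddvvv.v) into (date, value)."""
    year = int(line[0:4])
    day_of_year = int(line[4:7])             # assumption: 001 is 1 January
    value = float(line[7:])
    return date(year, 1, 1) + timedelta(days=day_of_year - 1), value
```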

Space delimited time series

A space- or tab-delimited (.SDT) column time-series format file is an ASCII text file that contains time-series data. There is no header line in the file. The format is shown in Table 27. Monthly and annual data can be entered using month and/or day number as 01. These files can be created in a spreadsheet application by saving correctly formatted columns to a text (.TXT) format.

Table 27. .SDT data file format

Row	Column (space-delimited)
Row	1	2	3	4
1..n	year	month	day	value

where:

year is the year of observation (four digits)

month is the month of observation (one or two digits)

day is the day of observation (one or two digits)

value is the data value to three decimal places (eg. 14.000).

SILO 5 time series

A QDNR .SILO5 daily time-series format file is an ASCII text file that contains daily time-series data. The format is shown in Table 28. This format sometimes uses the .TXT file extension.

Table 28. SILO 5 data file format

Row	Column (space-delimited)
Row	1	2	3	4	5
1..n	year	month	day	jday	value

where:

year is the year of observation (four digits)

month is the month of observation (one or two digits)

day is the day of observation (one or two digits)

jday is the Julian day offset within the year (one, two or three digits)

value is a data point.

SILO 8 time series

The .SILO8 format contains the full 8 column daily data set from the SILO data base. The file can have multiple header lines, enclosed in inverted commas. The format of data rows is shown in Table 29.

Table 29. SILO 8 data file format

Row	Column (space-delimited)
Row	1	2	3	4	5	6	7	8
1..n	maxt	mint	rain	evap	rad	vpress	maxrh	minrh

where:

maxt is the maximum temperature

mint is the minimum temperature

rain is the rainfall

evap is the evaporation

rad is the radiation

vpress is the vapour pressure

maxrh is the maximum relative humidity

minrh is the minimum relative humidity.
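The following Python sketch reads the data rows of Table 29 into dictionaries keyed by column name. The names are hypothetical, and the sketch assumes that any line beginning with an inverted comma (single or double) is a header line to be skipped.

```python
SILO8_COLUMNS = ("maxt", "mint", "rain", "evap", "rad", "vpress", "maxrh", "minrh")

def read_silo8(text):
    """Parse .SILO8 content into a list of dicts, one per data row."""
    records = []
    for ln in text.splitlines():
        stripped = ln.strip()
        if not stripped or stripped[0] in "\"'":
            continue                         # blank line or quoted header line
        tokens = stripped.split()
        if len(tokens) != len(SILO8_COLUMNS):
            raise ValueError(f"expected 8 columns, got {len(tokens)}")
        records.append(dict(zip(SILO8_COLUMNS, (float(tok) for tok in tokens))))
    return records
```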

Grid-based Terrain Analysis Data

A .TAPESG file is a three column raster data format, with space separated values. Each line consists of the X coordinate, Y coordinate, and value. The format is shown in Table 30.

Table 30. .TAPESG data file format

Row	Column (space-delimited)
Row	1	2	3..n
1	x	y	value

where:

(x,y) are coordinates

value is a data point.

Tarsier daily time series

The Tarsier daily time-series format file (.TTS) is an ASCII text file that contains daily time-series data. The file has a 21-line header (Table 64) followed by daily data values in the format shown in Table 31.

Table 31.
