5 Integrated Assessment


5.1 Overview

The previous chapters have detailed the model setup, additions and areas of development. In this section, the model is assessed in its entirety, using the fully coupled model, against the complete dataset. The assessment approach loosely follows the CSPS framework of Hipsey et al. (2020). The framework considers:

  • Level 0: conceptual evaluation; the conceptual diagrams for each of the water quality variables, biogeochemical reactions, and habitat modelling are based on the scientific review and data inspection as detailed in the previous chapters;
  • Level 1: simulated state variables; a range of metrics is used for a large number of predicted variables at different sites;
  • Level 2: process rates; and
  • Level 3: system-level patterns and emergent properties; this is evaluated by the overall nutrient budget analysis, nutrient cycling pathway analysis, and assessment of the relationship between the areas of habitat for seagrass.

The specific data available for validation and the assessment metrics used are described next. Model uncertainty is discussed in terms of how much confidence can be placed in the current generation of model outputs, for the purpose of defining model reliability.

Beyond the performance assessment, the model is routinely used to support management decision-making. Detail is provided on how the diverse model outputs can be processed and analysed to inform decision-making and other applications.


## Model assessment approach

5.1.1 Summary of validation data-set

The field observation data available for model validation and assessment include a diversity of historical data (collected pre-2021) and a large volume of data generated by recent monitoring and WAMSI-Westport research projects. Relevant data for validation include:

  • In situ water quality sensors: high-frequency measurements at fixed locations;
  • Water quality grab samples;
  • Biotic surveys; and
  • Strategic experimental data.

All data relevant to model calibration and validation are included in the CSIEM Data Catalogue and detailed in Appendix A. The data span a wide range of locations and time periods; however, the primary model assessment focuses on the most intensive period of monitoring. Long-term assessments are also undertaken for different versions of the model, as outlined next.

5.1.2 Performance assessment metrics

The modelling results are compared against historical data collected within Cockburn Sound (where available), using both traditional statistical metrics of model error and other metrics relevant to model performance. The approach is applied to each model generation, with the aim of identifying areas where the model is accurate and areas requiring further improvement and ongoing calibration effort.

Error metrics : Model performance in predicting a range of relevant variables, including salinity, temperature, nitrogen, phosphorus and total chlorophyll-a, is first assessed with a set of statistical metrics. The metrics are calculated for each observation site where the number of field observations was >10 in the assessment period.
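The site-selection rule above can be sketched as follows. This is a minimal illustration only: the function name `sites_with_enough_data`, the record layout `(site, observed, modelled)` and the example values are assumptions made here, not part of the CSIEM tooling.

```python
from collections import defaultdict

# Threshold taken from the text: a site needs more than 10 field observations.
MIN_OBS = 10


def sites_with_enough_data(records, min_obs=MIN_OBS):
    """Group (site, observed, modelled) records by site.

    Returns only the sites that have more than `min_obs` paired values.
    The record layout is a hypothetical choice for this sketch.
    """
    by_site = defaultdict(list)
    for site, observed, modelled in records:
        by_site[site].append((observed, modelled))
    return {site: pairs for site, pairs in by_site.items() if len(pairs) > min_obs}


# Hypothetical example: site "A" has 11 paired values, site "B" has only 3.
records = [("A", 35.0 + 0.1 * i, 35.2 + 0.1 * i) for i in range(11)]
records += [("B", 20.0, 21.0)] * 3
kept = sites_with_enough_data(records)
```

In this example only site "A" is retained for metric calculation, because site "B" falls below the observation threshold.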

The core statistical metrics considered consist of:

  • \(r\): correlation coefficient. Varies between -1 and 1, with a score of 1 indicating the model varies perfectly with the observations and a negative score indicating the model varies inversely with the observations. A consistent bias may be present even when a high \(r\) score is obtained.
  • \(BIAS\): bias of the average prediction relative to the average observation over the assessment period. This metric gives the magnitude of the overall discrepancy between the model results and the observational data.
  • \(MAE\): mean absolute error. Similar to RMSE, except that the absolute value of each error is used rather than its square. This reduces the bias towards large events. Values near zero indicate good model skill.
  • \(RMS\): root mean squared error (RMSE). Measures the mean magnitude, but not the direction, of the difference between model data and observations, and hence indicates the overall size of the error rather than the sign of any bias. Values near zero are desirable. This method is not affected by cancellation of negative and positive errors, but squaring the data may cause bias towards large events.
  • \(nash\): the Nash-Sutcliffe metric (also called \(NSE\) or \(MEF\)) is a measure of modelling efficiency based on the magnitude of the difference between model data and observations. It compares the performance of the model with that of a predictor that uses only the mean of the observed data. A value of 1 indicates a perfect model, a value of zero indicates performance similar to simply using the mean of the observed data, and negative values indicate performance worse than the observed mean.
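The metrics listed above can be computed from paired observed and modelled values as in the sketch below. It uses the standard textbook definitions and is not taken from the CSIEM code base; the function name `error_metrics` is hypothetical, and the sign convention for the bias (model mean minus observed mean) is an assumption, since the text does not state one.

```python
import math


def error_metrics(obs, mod):
    """Return r, BIAS, MAE, RMSE and NSE for paired observed/modelled values.

    Standard definitions are assumed; BIAS is taken as mean(model) - mean(obs).
    """
    n = len(obs)
    if n != len(mod) or n < 2:
        raise ValueError("need two equal-length series with at least 2 values")

    mean_o = sum(obs) / n
    mean_m = sum(mod) / n
    diffs = [m - o for o, m in zip(obs, mod)]
    sq_err = sum(d * d for d in diffs)

    # Sums of squares about each mean, and the cross term for the correlation.
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_m = sum((m - mean_m) ** 2 for m in mod)
    cross = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, mod))

    nan = float("nan")
    return {
        # Undefined when either series is constant.
        "r": cross / math.sqrt(ss_o * ss_m) if ss_o > 0 and ss_m > 0 else nan,
        "BIAS": mean_m - mean_o,
        "MAE": sum(abs(d) for d in diffs) / n,
        "RMSE": math.sqrt(sq_err / n),
        # 1 - (squared model error) / (squared deviation of obs from their mean).
        "NSE": 1.0 - sq_err / ss_o if ss_o > 0 else nan,
    }


# Hypothetical series: the model tracks the observations with a constant offset.
offset_case = error_metrics([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
# Hypothetical series: the model always predicts the observed mean.
mean_case = error_metrics([1, 2, 3, 4, 5], [3, 3, 3, 3, 3])
```

The first example gives a perfect correlation together with a non-zero bias, which is why \(r\) is read alongside the other metrics; the second shows that a predictor equal to the observed mean scores an NSE of zero.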

Seasonality : The model results are assessed in terms of the degree of seasonal fluctuation seen in the field data. Whilst this is captured in the error metrics (e.g. \(r\)), a visual assessment can identify timing issues related to seasonal peaks.
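One simple numerical complement to the visual check is to compare the month in which the observed and modelled monthly means peak. The sketch below is illustrative only: the helper names, the `(date, value)` input layout and the example values are assumptions, not part of the assessment described here.

```python
from collections import defaultdict
from datetime import date


def monthly_means(series):
    """Average (date, value) pairs by calendar month: {month: mean}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in series:
        sums[day.month] += value
        counts[day.month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}


def peak_month(series):
    """Calendar month (1-12) with the highest monthly mean."""
    means = monthly_means(series)
    return max(means, key=means.get)


def peak_lag_months(obs_series, mod_series):
    """Signed lag of the modelled peak relative to the observed peak.

    Wrapped to the range -5..6 months so that December to January counts as +1.
    """
    lag = (peak_month(mod_series) - peak_month(obs_series)) % 12
    return lag - 12 if lag > 6 else lag


# Hypothetical temperature-like values: observations peak in February,
# the model peaks in March.
obs = [(date(2021, 1, 15), 20.0), (date(2021, 2, 15), 24.0),
       (date(2021, 3, 15), 22.0), (date(2021, 7, 15), 16.0)]
mod = [(date(2021, 1, 15), 20.0), (date(2021, 2, 15), 22.0),
       (date(2021, 3, 15), 23.5), (date(2021, 7, 15), 16.0)]
lag = peak_lag_months(obs, mod)
```

Here the model peak lags the observed peak by one month, a timing error that a correlation score alone would not make obvious.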

Transects : The model results are assessed in terms of the seasonal mean along the length of the domain (longitudinal transect). The transect analysis allows a system-wide assessment of conditions that smooths out noise and local variability in the field data and model predictions.
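A seasonal-mean transect can be assembled as in the sketch below. The station names, distances and the `(station, month, value)` layout are hypothetical, and the three-month Southern Hemisphere season definitions are an assumption made for this illustration.

```python
from collections import defaultdict

# Assumed Southern Hemisphere seasons (summer = Dec-Feb); not stated in the text.
SEASON_OF_MONTH = {
    12: "summer", 1: "summer", 2: "summer",
    3: "autumn", 4: "autumn", 5: "autumn",
    6: "winter", 7: "winter", 8: "winter",
    9: "spring", 10: "spring", 11: "spring",
}


def seasonal_transect(samples, station_distance_km):
    """Seasonal mean per station, ordered by distance along the transect.

    samples: iterable of (station, month, value).
    Returns {season: [(distance_km, mean_value), ...]} sorted by distance.
    """
    values = defaultdict(list)
    for station, month, value in samples:
        values[(SEASON_OF_MONTH[month], station)].append(value)

    transect = defaultdict(list)
    for (season, station), vals in values.items():
        transect[season].append((station_distance_km[station], sum(vals) / len(vals)))
    return {season: sorted(points) for season, points in transect.items()}


# Hypothetical stations "S1" and "S2", 5 km apart along the transect.
distances = {"S1": 0.0, "S2": 5.0}
samples = [("S2", 1, 4.0), ("S1", 1, 2.0), ("S1", 2, 4.0), ("S2", 7, 1.0)]
result = seasonal_transect(samples, distances)
```

Applying the same function to the field data and to model output sampled at the same stations gives two directly comparable profiles for each season.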

Advanced measures : Finally, the model results are assessed by considering the partitioning of nutrients between inorganic and organic forms, and other expected measures.

Confidence : Based on the above assessment we evaluate confidence in the model by assigning each variable to the following categories:

  • Good
  • Acceptable, and
  • Caution.

This confidence evaluation considers:

  • Quality of observed data, which is influenced by field and laboratory data limitations, methodologies, processes and protocols.
  • Error metric scores relative to what is typically reported in the literature for water quality models (e.g., Arhonditsis and Brett, 2004).
  • Ability of the CSIEM to capture the mean of an indicator and its spatial gradient and seasonality.
  • Partitioning of water quality constituents within different ecosystem pools.
  • Natural variability of the indicator at different temporal scales (i.e. sub-daily to seasonal).


5.2 Assessment and validation periods

  • Historical period: 1970-2010: initial assessment prior to availability of WAMSI research project data;
  • Recent period: 2010-2020: initial assessment prior to availability of WAMSI research project data;
  • Focus period: 2021-2022: initially calibrated against the intensive field sampling and observations obtained from different components of the WAMSI research project;
  • Long-term performance: 2017-2022: calibrated against the long-term water quality data collected from the routine measurements, as well as from the WAMSI research project. These results are summarised below.
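When observations are sorted into these periods for analysis, the overlaps between periods need to be handled explicitly. The sketch below encodes the year ranges listed above; the names `ASSESSMENT_PERIODS` and `periods_for_year` are hypothetical, and treating both end years as inclusive is an assumption.

```python
# Year ranges as listed above; end years are assumed to be inclusive.
ASSESSMENT_PERIODS = {
    "Historical": (1970, 2010),
    "Recent": (2010, 2020),
    "Focus": (2021, 2022),
    "Long-term performance": (2017, 2022),
}


def periods_for_year(year):
    """Return the names of all assessment periods that contain `year`."""
    return [
        name
        for name, (start, end) in ASSESSMENT_PERIODS.items()
        if start <= year <= end
    ]


focus_year = periods_for_year(2021)
boundary_year = periods_for_year(2010)
```

Returning a list rather than a single label keeps the overlap visible: an observation from 2021, for example, contributes to both the focus period and the long-term performance assessment.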


5.3 Integrated simulation performance

The integrated simulation performance will be provided in Appendix B, which is pending and will be updated as the CSIEM modelling progresses.