NAMC Water Quality Report Contents

All our water chemistry reports include the following columns. If further explanation is needed, please do not hesitate to ask us.

SampleId – NAMC unique sample identifier
Customer submitted visitId – unique sample identifier submitted by the customer
Predicted total nitrogen – the predicted TN value in ug/L
Predicted total phosphorus – the predicted TP value in ug/L
Predicted electrical conductivity – the predicted EC value in uS/cm
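Model applicability – whether each sample is within the model's range of experience, reported for each predicted water quality (WQ) attribute
StreamCat variables – the StreamCat variables used to assess model applicability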

Considerations for Index Interpretation

Condition Determination

We recommend adding the 75th and 95th percentiles of model error to each prediction to produce benchmarks against which observed field values can be compared. The model error percentiles are provided in the model metadata.
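As a minimal sketch of the arithmetic described above, the following Python shows how the two benchmarks could be computed for one sample and compared with an observed value. The function names and all numbers are hypothetical and for illustration only; the actual error percentiles must be taken from the model metadata.

```python
def benchmarks(predicted, error_p75, error_p95):
    """Add the model error percentiles to a prediction to get two benchmarks."""
    return {"p75": predicted + error_p75, "p95": predicted + error_p95}


def exceeded_benchmarks(observed, predicted, error_p75, error_p95):
    """List the benchmarks that the observed field value is greater than."""
    marks = benchmarks(predicted, error_p75, error_p95)
    return [name for name, value in marks.items() if observed > value]


# Made-up example: predicted TN of 200 ug/L, with assumed model error
# percentiles of 50 ug/L (75th) and 120 ug/L (95th).
tn_benchmarks = benchmarks(200.0, 50.0, 120.0)
tn_exceeded = exceeded_benchmarks(300.0, 200.0, 50.0, 120.0)
```

With these made-up numbers, the benchmarks are 250 and 320 ug/L, so an observed value of 300 ug/L is above the 75th percentile benchmark but not the 95th.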

Model Applicability

Care should be taken to ensure that the environmental conditions at sample sites are similar to those at the reference sites used to develop the indices. To assist with this, NAMC includes a “Model Applicability” column, which indicates whether a sample site's environmental conditions are within the model's range of experience. A “fail” indicates that the model had to extrapolate, rather than interpolate, when making predictions. Fails indicate that the stream was an outlier in environmental space (beyond the 90th percentile threshold) compared to the reference sites used to build the index. Index scores should be interpreted cautiously if a site failed this test.

How does NAMC determine model applicability?

We developed a nearest-neighbor based approach to identify sites whose environmental characteristics fall outside the environmental space defined by the reference sites. For WQ indices, we plotted the important predictors in each model using a principal components analysis (PCA) with varimax rotation. We then selected the top 3-4 least-correlated predictors for each model. These variables were used to characterize the general environmental space of the reference sites for each model and to provide a basis for comparing new sites.

We standardized all representative variables by scaling between the minimum and maximum values observed in the reference data. We used the standardized variables to calculate Euclidean multivariate distances between each reference site and all other reference sites. We then calculated the average distance of each reference site to its 10 nearest other reference sites and used the 90th percentile of this distribution as the threshold for deciding whether a new site was outside reference site environmental space.

To apply this test to new sites, we calculated the average distance of each new site to its 10 nearest reference sites and flagged the new site as an outlier if that average distance exceeded the 90th percentile threshold. Boxplot distributions of the 4 environmental gradients above at the reference sites used to build the model can then be compared with an individual site's environmental gradients to determine why the site was considered an outlier. These boxplots are provided on each individual index page.
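The applicability test described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not NAMC's implementation: all function names are hypothetical, the percentile is computed by linear interpolation between ranked values (the source does not state which convention is used), and predictor selection by PCA is assumed to have been done already, so each site is simply a row of the 3-4 selected predictor values.

```python
import math


def min_max_scale(rows, mins, maxs):
    """Scale each column using the reference minimum and maximum."""
    return [[(v - lo) / (hi - lo) for v, lo, hi in zip(row, mins, maxs)]
            for row in rows]


def mean_knn_distance(point, reference, k=10):
    """Average Euclidean distance from point to its k nearest reference rows."""
    nearest = sorted(math.dist(point, ref) for ref in reference)[:k]
    return sum(nearest) / len(nearest)


def percentile(values, pct):
    """Percentile by linear interpolation (an assumed convention)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def fit_applicability(reference_rows, k=10, pct=90):
    """Return scaling bounds, scaled reference sites, and the distance threshold."""
    columns = list(zip(*reference_rows))
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    scaled = min_max_scale(reference_rows, mins, maxs)
    # Each reference site is compared with its k nearest *other* reference sites.
    ref_dist = [mean_knn_distance(site, scaled[:i] + scaled[i + 1:], k)
                for i, site in enumerate(scaled)]
    return mins, maxs, scaled, percentile(ref_dist, pct)


def assess_site(site_row, mins, maxs, scaled_reference, threshold, k=10):
    """Return 'fail' if the new site's mean distance exceeds the threshold."""
    scaled_site = min_max_scale([site_row], mins, maxs)[0]
    distance = mean_knn_distance(scaled_site, scaled_reference, k)
    return "fail" if distance > threshold else "pass"
```

A new site is scaled with the reference minimum and maximum rather than its own, so values outside the reference range map outside 0 to 1 and increase its distance from the reference sites. The sketch assumes every predictor varies across the reference sites; a constant column would cause a division by zero in the scaling step.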