^{1}

^{1}

^{2}

^{1}IDAEA, CSIC, Jordi Girona 18-26, Barcelona 08034, Spain.

^{2}ICRA-CERCA, Catalan Institute for Water Research, Scientific and Technologic Park of the UdG, Girona 17003, Spain.

Our knowledge of the river’s qualitative status generally relies on discrete spatial and temporal observations organized under what is commonly known as a “monitoring network”. Network performance is constrained by its spatial - temporal resolution, which is severely limited by the costs associated with the whole sampling and analytical process. Alternatively, modeling allows predicting the spatial - temporal variable profile at any resolution at affordable computing costs. However, it involves high uncertainty in the parameterization and requires experimental validation as well. Here, we aimed at reconciling monitoring and modeling, deriving simple steady-state advection-reaction (reactive-transport) models from monitoring data. They are based on graph-theoretical concepts, notably the use of the Laplacian matrix, which captures the river network topology, the interaction between adjacent sites, and the advection process between them. The local reactive process is described by a first-order decay reaction. The application of these models provided relevant information about the variables monitored, such as the local dynamics, the distance of the site’s influence, the degree of synchronization, or the external input/output to the system, which is useful for both scientific and management purposes.

The model was tested in the Llobregat River (NE Spain) basin, with 70 emerging contaminants of different classes (pharmaceuticals, pesticides, perfluorinated substances, endocrine disruptors, and drugs of abuse). The monitoring network included 14 sites (7 in the mainstream, 4 in the Cardener, and 3 in the Anoia tributaries) and was monitored in 2 campaigns. These models can help water managers to optimize the design of river monitoring networks, a key aspect of environmental regulations.

Rivers are net receivers of both point and diffuse pollution, such as nutrients, metals, emerging pollutants, ^{[1]}, which are considered one of the main causes of freshwater biodiversity impairment^{[2,3]}. Many chemical pollutants are not environmentally persistent; rather, they change due to multiple biotic (biodegradation) and abiotic processes (sorption, photolysis, hydrolysis, ^{[4]}. Therefore, our knowledge of the river’s qualitative status relies on discrete spatial and temporal observations of a set of physical, chemical, or biological parameters, organized under what is commonly known as a “monitoring network” [^{[5]}. While both approaches - monitoring and modeling - have their respective pros and cons, there is a growing interest in the latter due to the increasing development and affordability of computation and information techniques compared to costly monitoring campaigns^{[5,6]}. Modeling approaches have been mostly focused on the prediction of environmental concentrations of pollutants. Existing models include GREAT-ER^{[7,8]}, PhATE^{[9]}, LF2000-WQX^{[10,11]}, and MERLIN-EXPO^{[12]}, among others^{[13-16]}. However, advanced modeling is not exempt from limitations. They are “data-hungry” and require combining a hydrological model with that of the physical - chemical processes taking place in the river, whose parameterization is often subjected to high uncertainty^{[17]} that depends on local conditions^{[18-20]}. Moreover, they ultimately need to be validated through experimental measurements. Altogether, monitoring and modeling must be regarded as complementary tools that should be advantageously used together^{[5,21]}.

(A) Map of the Llobregat River basin showing the sampling sites. (B) Network graph representation of the area studied. The distances (km) between adjacent sites are indicated in blue.

Although the use of spatial (topological) relationships in the study of rivers is not new^{[21-27]}, here, we explore new possibilities for exploitation of experimental data available from river monitoring networks, and their interpretation in the light of simple reactive-transport (or advection-reaction) that can be readily derived therefrom. To do so, we make use of some graph-theoretical concepts recently used in the development of models (oscillation models) applied in other domains such as social networks (Internet networks, ^{[28,29]}.

The proposed advection-reaction model approach was tested using an available dataset that includes concentrations of ^{[30]}.

The Llobregat River (NE Spain)^{[31]} is 156 km long and covers a catchment area of about 4957 km^{2 }[^{3}, and it has an annual average discharge of 693 Hm^{3}. Its watershed is heavily populated with more than three million inhabitants, mostly in the lowest part located in the surroundings of the city of Barcelona. Together with its two main tributaries, River Cardener and River Anoia, the Llobregat is subjected to heavy anthropogenic pressure, receiving extensive urban and industrial wastewater discharges (137 Hm^{3}/year; 92% coming from the wastewater treatment plants), which constitutes a significant part of its natural flow. Overall, 48% of these point sources are located in the studied area. Furthermore, in the middle part of the basin, it receives brine leachates from natural salt formations and mining operations, which have caused an increase in water salinity downstream. The Llobregat is thus an illustrative example of overexploited river.

The river studied can be described as a set of spatially distributed connected sites (nodes) in which we carry out measurements of a variable ^{[28,29] }by a graph

We define the degree _{i}

where

The so-called graph

The elements of L are given by:

The Laplacian matrix _{1} = 0 the smallest one and its associated eigenvector _{1} ^{T}. This vector corresponds to the fully synchronized state, in which the variable studied _{1}=x_{2}=…x_{i}…= x_{n-1}= x_{n}

The model provides a simple and general description capturing the network dynamics. Briefly, let the state of node _{i}_{1}, .... , x_{n}^{T} is the _{i}

where the first term on the right side of the equation reflects a first-order decay process with a rate constant _{i} are local inputs or sinks at each site

It can be written in the following compact vector form:

Assuming that the measurements correspond to a stationary state (

Numerically, the parameter

It can be shown that the inverse of the OLS calculated slope

Noting that the right-hand expression in the above equation is the Rayleigh quotient of the Laplacian matrix

Furthermore, expanding _{i}_{i}^{T}_{i} to

with

Entropy _{1} = 0, which quantifies the weight of the synchronized state (note, however, that the contribution of the synchronized state to _{1} = 0).

Without loss of generality, the foregoing definitions of the adjacency and Laplacian matrices can be extended to “weighted” analogs, i.e., _{ij}_{ij}_{ij}_{ij}_{ij}_{ij }^{−1}). Defined in this way, it is worth noting that the terms _{i }- x_{j})/r_{ij}

All the calculations were performed using Excel (Microdsoft®) and the R environment (version 4.1.0). Measurements below the limit of detection (< LOD) and the limit of quantification (< LOQ) were set equal to 0 for calculation purposes.

The sampling of water for chemical characterization was performed at the end of the summer period in campaigns carried out in autumn for two consecutive years, namely 2010 (C1) and 2011 (C2). Sampling sites (14) were distributed in the Llobregat mainstream (seven sites), and the Cardener and Anoia tributaries (four and three sites, respectively) [^{[32]}, pharmaceuticals^{[33]}, endocrine-disrupting chemicals (EDCs) and related compounds (hormones, plasticizers, alkylphenols, parabens, phosphate flame retardants, anticorrosion agents, and bactericides)^{[34]}, perfluorinated compounds^{[35,36]}, and drugs of abuse^{[37]}.

To avoid an excessive number of “0 values” (i.e., not detected or not quantified), the above-described methods were applied to compounds having detection frequencies equal to or greater than 20% in each of the two sampling campaigns performed, which resulted in a final selection of 71 compounds out of the 199 monitored [

Monitoring information (limits of detection, detection frequency, and mean and maximum concentrations) corresponding to the selected compounds and the two campaigns are collected in ^{[32-37]} and their associated ecotoxicological risk^{[38,39]}, as well as their relationship with the receiving ecosystems^{[1,40,41]}, was previously published.

The application of the above-described advection-reaction model (Section “Study area”) to each of the 71 compounds selected provided as main quantitative outputs the following information: (a) characteristic distances (ℓ) [Equation (3)]; (b) entropy [Equation (6)]; (c) local input/outputs [Equation (3)]; and (d) the contribution of the different eigenstates [Equation (5)]. Although these quantitative indicators are not fully independent and exhibit some quantitative relationships (see below), they provide somewhat complementary information useful for different interpretation purposes. Whereas the characteristic length (ℓ = ^{T}Lx/x^{T}x, Equation (4)] is often referred to in graph theory as the “network energy”. Finally, its spectral decomposition in terms of the eigenvectors and eigenvalues of

Boxplots showing the distribution of the values per sampling campaign (C1, 2010; C2, 2011) for the different compound classes: (A) characteristic length (km); (B) entropy; and (C) synchronized state contribution. The upper and lower bounds correspond to maximum and minimum values; boxes are limited by the 75th and 25th quartiles; and dots correspond to outliers.

Entropy values for the different compound classes are given in ^{[42]}.

As mentioned above, the three properties studied exhibit related trends, which are depicted in

Scatter plots showing the relationships between characteristic length and entropy as well as synchronized state of all compounds studied in the two campaigns (

Further analysis of the processes involved can be achieved considering the respective contributions of the different eigenstates of the Laplacian matrix L to Equations (5) and (6). Specifically, the contribution of the synchronized state (i.e., equal values of measured pollutant

Value distribution per compound along the two sampling campaigns monitored (C1, 2010; C2, 2011): (A) characteristic length (km); (B) entropy; and (C) contribution of the synchronized state (%).

Contrastingly, entropy values were generally higher in 2011 (C2) than in 2010 (C1) [

Relevant information regarding the input/output pollution load throughout the basin can be obtained from the vector ε = δ/

Net input/output per site along the two sampling campaigns monitored (C1, 2010; C2, 2011) at each sampling site of: (A) 1H-benzotriazole (anticorrosion agent, industrial use); (B) chlorpyriphos (insecticide, agriculture use); (C) diclofenac (pharmaceutical, urban use); and (D) total pollution load (expressed as the sum of all pollutants analyzed).

River basin quality status assessment is largely determined through discrete measurements of different variables carried out at certain sites located alongside the catchment and specified time frequencies, under what is usually known as a “monitoring network”. These are currently used for both research and water management purposes and have given rise to large datasets. Network performance is constrained by its spatial-temporal resolution, which, in turn, is severely limited by the, often expensive, costs associated with the whole sampling and analytical process, as in the case of emerging contaminants. Alternatively, modeling allows predicting the spatial-temporal variable profile at any resolution at an affordable computing cost. However, it involves high uncertainty in the parameterization and requires, in the end, experimental validation.

In the present short contribution, we aimed at reconciling both approaches - monitoring and modeling - through the interpretation of the experimental data under the framework of a simple advection-reaction (i.e., reactive-transport) model based on graph-theoretical concepts, notably the use of the Laplacian matrix, which adequately captures the river network topology and the interaction between adjacent sites. Whereas the Laplacian matrix has been previously used in spatial geographic analysis (e.g., the so-called “Geary coefficient”^{[43]}, a widely used spatial correlation index, can be expressed in terms of the Laplacian matrix^{[44]}), here we used it for the first time in two novel aspects: (a) to capture the advection process taking place between adjacent sites in the context of an advection-reaction model; and (b) the definition of an entropy that quantifies the contribution of the different eigenstates (modes) of the Laplacian to the measured variable.

Despite their obvious limitations, the approach presented here is straightforward and can provide useful information regarding the dynamics of the variables monitored, such as the local dynamics, the influence of the neighbor sites, the degree of synchronization, or the external input/output to the system. Even though the methods and results presented in this study were specifically concerned with emerging contaminants, they can be easily extended to any other site-measured variables as well, such as nutrients, inorganic constituents, heavy metals, or even biological parameters such chlorophyll, dissolved enzymes (alkaline phosphatase),

Finally, since monitoring for water quality assessment is a key aspect in the implementation of environmental legislation, such as the Water Framework Directive (Directive 2000/60/EC), the presented modeling approach can provide valuable insights for water managers engaged in the proper and consistent design of river monitoring networks, whether surveillance, operational, or investigative, with potential practical consequences on the optimization of economic costs.

This work has been supported by the Spanish Ministry of Science and Innovation (MCIN).

Involved in the conception: Ginebreda A, Barceló D

Involved in the design, data treatment and interpretation: Ginebreda A

Writing of manuscript and critical reading: Ginebreda A, Barceló D

Data are available at the Supplementary Material [

This work has been financed by the Spanish Ministry of Science and Innovation (MCIN), through Project SINERGIA included in the Severo Ochoa Grant CEX2018-000794-S funded by MCIN/AEI/ 10.13039/501100011033.

All authors declared that there are no conflicts of interest.

Not applicable.

Not applicable.

© The Author(s) 2022