
Report

Teaser, summary, work performed and final results

Periodic Reporting for period 1 - INNO-CYANO (INNOvative mapping, reporting and forecasting of CYANObacteria blooms applying the synergistic use of advanced in situ optical data, satellite imagery and statistical modelling)

Teaser

Cyanobacterial blooms occur worldwide and pose a serious threat to human health and the natural environment. Monitoring cyanobacterial blooms is important for environmental agencies, water authorities, and human and animal health organizations, since cyanobacteria cause a range of...

Summary

Cyanobacterial blooms occur worldwide and pose a serious threat to human health and the natural environment. Monitoring cyanobacterial blooms is important for environmental agencies, water authorities, and human and animal health organizations, since cyanobacteria cause a range of water quality and water treatment problems. The main reason cyanobacterial blooms are hazardous to human and animal health is that about 60-80 of the 300 bloom-forming species of phytoplankton can produce toxins. Various health issues are associated with the more than 60 identified cyanobacterial toxins.

Blooms of non-toxic cyanobacteria can also cause problems for a water body’s ecosystem. It is well known that cyanobacterial surface scums can be very thick and cause oxygen depletion in the affected area. Extensive blooms can also reduce light penetration to the bottom, which decreases the density of submerged aquatic vegetation. The effects are economic as well: dying blooms washing up on beaches during the peak of the summer holiday season have frequently resulted in economic losses.

The management of surface water bodies requires appropriate monitoring. For water quality, setting up appropriate monitoring is not straightforward. A wide range of parameters can be measured, the variation over space and time needs to be considered when choosing measurement locations and frequencies, and the costs of equipment and methods need to be justified by the returns.

Water quality is determined by a range of different properties such as physical-chemical parameters like nutrients and chemical compounds, physical properties like temperature and turbidity, and biological parameters like the presence and abundance of plants and microorganisms.

To carry out efficient water management, users such as water managers require information covering varying periods of time and varying measuring intervals. To gain the most insight into the (ecological) processes taking place, it is best to combine data over a longer period (e.g. months or years, to understand the historical situation) with spatial data (e.g. many sampling locations, or maps obtained from satellites) to understand the distribution of components. Furthermore, because water systems can be very dynamic, it is important to also take high-frequency data into account to understand the dynamics, and to use this high-frequency information as a basis for models that forecast future developments and trends.

Water Insight can already deliver high-frequency and spatial monitoring data. Its WISPstation instrument collects spectral data that can be translated into parameters such as cyanobacterial pigments (as a proxy for cyanobacterial biomass), chlorophyll-a (as a proxy for total biomass) and an index for the presence or absence of scum layers. To complete the set of monitoring services, the objective of INNO-CYANO was to develop an innovative, data-driven statistical model that provides fast forecasts of cyanobacterial blooms, especially scums.

Work performed

User requirements identified in earlier research projects and at meetings with users were listed. Generally, users want forecasts up to one week ahead, both for total phytoplankton and for cyanobacteria specifically. Users are interested in a daily warning system, because cyanobacterial blooms can appear and grow rapidly in calm weather conditions. The forecasts should correctly indicate whether or not a bloom is present; the absolute values matter less than the rate of false positives and false negatives. Based on these requirements and on knowledge of the parameters that trigger cyanobacterial blooms, the model specifications were listed. The main inputs would be earth observation data, optical in situ data and meteorological data. The outputs would be daily chlorophyll concentration and, for interested users, daily water temperature. (WP2)
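The requirement that presence or absence matters more than absolute values can be illustrated with a small scoring sketch. The code below is only an assumption of how such an evaluation might look: it turns measured and forecast chlorophyll concentrations into bloom flags and counts false positives and false negatives. The threshold value, the function names and the sample numbers are hypothetical and do not come from the project.

```python
# Hypothetical bloom threshold in micrograms per litre; the report does not
# state which value (if any) the project used.
BLOOM_THRESHOLD = 20.0


def bloom_flags(chl_values, threshold):
    """Convert chlorophyll concentrations into bloom / no-bloom flags."""
    return [value >= threshold for value in chl_values]


def confusion_counts(observed, predicted):
    """Count true/false positives and negatives for paired daily flags."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            counts["tp"] += 1  # bloom correctly forecast
        elif pred:
            counts["fp"] += 1  # false alarm
        elif obs:
            counts["fn"] += 1  # missed bloom
        else:
            counts["tn"] += 1  # correctly forecast no bloom
    return counts


def accuracy(counts):
    """Share of days on which the bloom flag was forecast correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Made-up daily values, for illustration only.
observed_chl = [5.0, 12.0, 25.0, 40.0, 18.0, 22.0]
forecast_chl = [6.0, 21.0, 19.0, 33.0, 10.0, 30.0]

counts = confusion_counts(
    bloom_flags(observed_chl, BLOOM_THRESHOLD),
    bloom_flags(forecast_chl, BLOOM_THRESHOLD),
)
print(counts, round(accuracy(counts), 2))
```

Reporting the four counts separately, rather than accuracy alone, keeps missed blooms and false alarms visible, which is what a daily warning system is judged on.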

Water quality data were gathered for several potential use cases (lakes). The best combination of available data for training a model was: high-frequency data on phytoplankton or cyanobacteria abundance over a longer period, satellite data for the spatial component, and simultaneous time series of meteorological data and water temperature data. The in situ data were processed, quality controlled and stored in an easily accessible format (WP3).
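To make the idea of simultaneous time series more concrete, the sketch below pairs the measurements of one day with the chlorophyll value a fixed number of days later, which is one common way to prepare training rows for a forecast model. It is an illustration under assumptions, not the project's processing chain: the field names (chl, water_temp, wind_speed), the choice of features and the sample values are hypothetical, and the records are assumed to be one per day, in chronological order, without gaps.

```python
def make_forecast_rows(daily_records, horizon_days):
    """Pair each day's features with the chlorophyll value horizon_days later.

    daily_records: list of dicts, one per day, in chronological order.
    Returns a list of (features, target) tuples.
    """
    if horizon_days < 1:
        raise ValueError("horizon_days must be at least 1")
    rows = []
    for i in range(len(daily_records) - horizon_days):
        today = daily_records[i]
        future = daily_records[i + horizon_days]
        features = [today["chl"], today["water_temp"], today["wind_speed"]]
        rows.append((features, future["chl"]))
    return rows


# Made-up daily records, for illustration only.
daily_records = [
    {"chl": 4.0, "water_temp": 12.0, "wind_speed": 3.0},
    {"chl": 6.0, "water_temp": 13.0, "wind_speed": 2.0},
    {"chl": 9.0, "water_temp": 14.5, "wind_speed": 1.0},
    {"chl": 15.0, "water_temp": 15.0, "wind_speed": 1.5},
    {"chl": 22.0, "water_temp": 15.5, "wind_speed": 4.0},
]

rows = make_forecast_rows(daily_records, horizon_days=2)
for features, target in rows:
    print(features, "->", target)
```

With five daily records and a two-day horizon, three training rows result; the last two days only serve as targets.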

The satellite data needed to be processed into maps of algal and cyanobacterial pigment concentrations, for which finding a suitable method to correct for atmospheric effects was an important step. After listing all data, it became clear that there were not enough satellite data (due to overpass frequency, clouds, and atmospheric correction problems) to include Sentinel-2 data in the model training. Therefore, the focus was put on high-frequency in situ measurements. These were available for one lake, which was then selected as the case study site. Data from in situ lab measurements, in situ optical measurements and satellite data were combined to study the spatial and temporal distribution. From a modelling perspective, the most important observation was that two blooms occur: a spring bloom and a summer/early autumn bloom (WP4).

Because no long-term dataset of water temperature was available, water temperatures were modelled using the FLake model (www.flake.igb-berlin.de). FLake is a freshwater lake model that predicts the vertical water temperature structure and mixing conditions in lakes, based on air temperature and on the integral budgets of heat and kinetic energy.

A random forest application was chosen for the INNO-CYANO model. Random forest is a data-driven model that needs observation data and runs fast once it is set up. A random forest algorithm builds decision trees from the input data, and these trees are then used to predict the results. The available input was divided into a training dataset and a test dataset. The training dataset was used to set up the random forest model and the test dataset was used to test the results. Predicted chlorophyll concentrations during bloom events are slightly lower than the measured concentrations. The INNO-CYANO model reaches an accuracy of 89% (WP5). However, it must be noted that in this case the training and test sets were not independent. At the time of writing, the model performance is being tested on a new, independent case.
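The report does not describe how the data were divided. As an illustration only, one common way to reduce dependence between training and test data in a time series is a chronological hold-out: the model is trained on the earlier part of the record and tested on the most recent part, optionally with a gap in between. The sketch below assumes samples sorted by time; the function name and parameters are hypothetical and this is not necessarily what the project did.

```python
def chronological_split(samples, test_fraction, gap=0):
    """Split time-ordered samples into an early training part and a late test part.

    gap: number of samples dropped between the two parts, so that training
    samples close in time to the test period are not used.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    if gap < 0:
        raise ValueError("gap must not be negative")
    n_test = max(1, round(len(samples) * test_fraction))
    split_index = len(samples) - n_test
    train = samples[: max(0, split_index - gap)]
    test = samples[split_index:]
    return train, test


# Ten consecutive daily samples, represented here by their day index.
samples = list(range(10))
train, test = chronological_split(samples, test_fraction=0.3, gap=2)
print("train:", train)
print("test:", test)
```

A random shuffle of daily samples would place neighbouring, strongly correlated days in both sets, which tends to make the measured accuracy look better than it would be on a truly new period or lake.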

Final results

Published cyanobacterial bloom forecast models are mainly area specific. For example, several models are designed for Lake Taihu (China) and Lake Erie (USA/Canada). These are generally ecological models, which are very complex and require, among other inputs, nutrient loads, and monitoring nutrient loads is laborious. The INNO-CYANO model is a simple, statistical, data-driven bloom model that can be adapted easily to different locations. For users, it will be combined with the WISPstation (a semi-continuous monitoring instrument) and, where suitable, with satellite-based maps, resulting in a service package that covers both the spatial and the temporal dimension.

Website & more info

More info: http://www.waterinsight.nl.