A Data-Based Framework for Identifying a Source Location of a Contaminant Spill in a River System with Random Measurement Errors

Jun Hyeong Kim, Mi Lim Lee, Chuljin Park

Research output: Contribution to journal › Article › Research › peer-review

Abstract

This study addresses the problem of identifying the source location of a contaminant spill in a river system when a sensor network returns observations containing random measurement errors. To solve this problem, we suggest a new framework comprising three main steps: (i) spill detection, (ii) data preprocessing, and (iii) source identification. Specifically, we applied a statistical process control chart to detect a contaminant spill with measurement errors while keeping the false alarm rate at less than or equal to a user-specified value. After detecting a spill, we generated a nonlinear regression model to estimate a breakthrough curve of the observations and derive a characteristic vector of the estimated curve. Using the characteristic vector as an input, a random forest model was constructed with the sensor raising the first alarm. The model provides output values between 0 and 1 to represent the possibility of each candidate location being the true spill source. These possibility values allow users to identify strong candidate locations for the spill. The accuracy of our framework was tested on part of the Altamaha River system in Georgia, USA.
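The detection step described above can be illustrated with a minimal sketch. The abstract does not say which control chart the authors use, so this example assumes a one-sided Shewhart-style chart on individual readings with normally distributed measurement error; the function names, the baseline data, and the choice of `alpha` are all hypothetical. Because the limit is computed from estimated baseline statistics, the per-reading false alarm rate is only approximately `alpha`.

```python
from statistics import NormalDist, mean, stdev


def upper_control_limit(baseline, alpha):
    """One-sided control limit from in-control readings.

    Assumes normally distributed measurement error, so a single
    in-control reading exceeds the limit with probability close to alpha.
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    return mean(baseline) + z * stdev(baseline)


def first_alarm(readings, ucl):
    """Return the index of the first reading above the limit, or None."""
    for i, x in enumerate(readings):
        if x > ucl:
            return i
    return None


# Hypothetical in-control readings: background concentration plus noise.
baseline = [1.02, 0.97, 1.01, 0.99, 1.03, 0.98, 1.00, 1.01, 0.96, 1.03]
ucl = upper_control_limit(baseline, alpha=0.001)

# Hypothetical monitoring window: noise only, then a rising plume.
readings = [1.01, 0.98, 1.04, 1.02, 1.35, 1.90, 2.40]
alarm_index = first_alarm(readings, ucl)
print(f"UCL = {ucl:.4f}, first alarm at index {alarm_index}")
```

In a sensor network, each sensor would keep its own limit, and the sensor whose alarm fires first would determine which downstream model is consulted. A smaller `alpha` raises the limit, which trades slower detection for fewer false alarms.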

Original language: English
Journal: Sensors (Basel, Switzerland)
Volume: 19
Issue number: 15
DOI: 10.3390/s19153378
State: Published - 2019 Aug 1

Fingerprint

Random errors
Hazardous materials spills
Measurement errors
Rivers
rivers
contaminants
Impurities
Nonlinear Dynamics
warning systems
sensors
false alarms
charts
curves
preprocessing
regression analysis
Statistical process control
output
estimates
Sensor networks
Identification (control systems)

Keywords

  • random forest
  • river system
  • sensor network
  • source identification
  • statistical process control
  • water quality monitoring

Cite this

@article{40b9045c6b6848c2b2eff2303fed024a,
title = "A Data-Based Framework for Identifying a Source Location of a Contaminant Spill in a River System with Random Measurement Errors",
abstract = "This study addresses the problem of identifying the source location of a contaminant spill in a river system when a sensor network returns observations containing random measurement errors. To solve this problem, we suggest a new framework comprising three main steps: (i) spill detection, (ii) data preprocessing, and (iii) source identification. Specifically, we applied a statistical process control chart to detect a contaminant spill with measurement errors while keeping the false alarm rate at less than or equal to a user-specified value. After detecting a spill, we generated a nonlinear regression model to estimate a breakthrough curve of the observations and derive a characteristic vector of the estimated curve. Using the characteristic vector as an input, a random forest model was constructed with the sensor raising the first alarm. The model provides output values between 0 and 1 to represent the possibility of each candidate location being the true spill source. These possibility values allow users to identify strong candidate locations for the spill. The accuracy of our framework was tested on part of the Altamaha River system in Georgia, USA.",
keywords = "random forest, river system, sensor network, source identification, statistical process control, water quality monitoring",
author = "Kim, {Jun Hyeong} and Lee, {Mi Lim} and Park, Chuljin",
year = "2019",
month = "8",
day = "1",
doi = "10.3390/s19153378",
language = "English",
volume = "19",
journal = "Sensors (Switzerland)",
issn = "1424-8220",
number = "15",
}

A Data-Based Framework for Identifying a Source Location of a Contaminant Spill in a River System with Random Measurement Errors. / Kim, Jun Hyeong; Lee, Mi Lim; Park, Chuljin.

In: Sensors (Basel, Switzerland), Vol. 19, No. 15, 01.08.2019.

TY - JOUR

T1 - A Data-Based Framework for Identifying a Source Location of a Contaminant Spill in a River System with Random Measurement Errors

AU - Kim, Jun Hyeong

AU - Lee, Mi Lim

AU - Park, Chuljin

PY - 2019/8/1

Y1 - 2019/8/1

AB - This study addresses the problem of identifying the source location of a contaminant spill in a river system when a sensor network returns observations containing random measurement errors. To solve this problem, we suggest a new framework comprising three main steps: (i) spill detection, (ii) data preprocessing, and (iii) source identification. Specifically, we applied a statistical process control chart to detect a contaminant spill with measurement errors while keeping the false alarm rate at less than or equal to a user-specified value. After detecting a spill, we generated a nonlinear regression model to estimate a breakthrough curve of the observations and derive a characteristic vector of the estimated curve. Using the characteristic vector as an input, a random forest model was constructed with the sensor raising the first alarm. The model provides output values between 0 and 1 to represent the possibility of each candidate location being the true spill source. These possibility values allow users to identify strong candidate locations for the spill. The accuracy of our framework was tested on part of the Altamaha River system in Georgia, USA.

KW - random forest

KW - river system

KW - sensor network

KW - source identification

KW - statistical process control

KW - water quality monitoring

UR - http://www.scopus.com/inward/record.url?scp=85071066689&partnerID=8YFLogxK

U2 - 10.3390/s19153378

DO - 10.3390/s19153378

M3 - Article

VL - 19

JO - Sensors (Switzerland)

JF - Sensors (Switzerland)

SN - 1424-8220

IS - 15

ER -