DOI: 10.1145/3219819.3219835
research-article

Using Machine Learning to Assess the Risk of and Prevent Water Main Breaks

Published: 19 July 2018

Abstract

Water infrastructure in the United States is beginning to show its age, particularly through water main breaks. Main breaks cause major disruptions in everyday life for residents and businesses. Water main failures in Syracuse, N.Y. (as in most cities) are handled reactively rather than proactively. A barrier to proactive maintenance with limited resources is the city's inability to properly prioritize the allocation of its resources. We built a machine learning system to assess the risk of a water main breaking. Using historical data on which mains have failed, descriptors of pipes, and other data sources, we evaluated several models' abilities to predict breaks three years into the future. Our results show that our system using gradient-boosted decision trees performed best out of several algorithms and expert heuristics, achieving precision at 1% (P@1) of 0.62. Our model outperforms a random baseline (P@1 of 0.08) and expert heuristics such as water main age (P@1 of 0.10) and history of past main breaks (P@1 of 0.48). The model is currently deployed in the City of Syracuse. We are conducting a pilot that calculates the risk of failure for each city block over the period 2016-2018 using data up to the end of 2015. As of the end of 2017, there have been 42 breaks on our riskiest 52 mains. This has been a successful initiative for the City of Syracuse in improving its infrastructure, and we believe this approach can be applied to other cities.
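
The abstract reports results as precision at 1% (P@1). As a rough illustration of how such a metric can be computed, the sketch below ranks mains by predicted risk and measures the share of the top 1% that actually broke. The function name, its parameters, and the toy data are assumptions made for this example; they are not taken from the paper, and the authors' exact evaluation procedure may differ.

```python
def precision_at_top_fraction(scores, labels, fraction=0.01):
    """Share of actual breaks among the top `fraction` of mains by risk score.

    scores: predicted risk per main (higher = riskier).
    labels: 1 if the main broke in the evaluation window, else 0.
    (Hypothetical helper, written for illustration only.)
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one main is required")
    # Size of the "top 1%" list; always inspect at least one main.
    k = max(1, round(len(scores) * fraction))
    # Rank mains from riskiest to least risky and keep the first k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:k])
    return hits / k


# Toy data (made up for illustration): 200 mains, so the top 1% is 2 mains.
toy_scores = [i / 200 for i in range(200)]
toy_labels = [0] * 200
toy_labels[199] = 1  # the riskiest main broke
toy_labels[150] = 1  # a lower-ranked main also broke
p_at_1 = precision_at_top_fraction(toy_scores, toy_labels)
print(p_at_1)  # 0.5: one of the two top-ranked mains broke
```

Under this definition, a random ordering of mains would be expected to score close to the overall break rate, which is one way to read the random baseline quoted in the abstract.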

Supplementary Material

MP4 File (kumar_water_main_breaks.mp4)

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2018
2925 pages
ISBN:9781450355520
DOI:10.1145/3219819

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. city planning
  2. civic tech
  3. social good

Funding Sources

  • Eric and Wendy Schmidt Family Foundation

Conference

KDD '18

Acceptance Rates

KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Cited By

  • (2024) Integrated intelligent models for predicting water pipe failure probability. Alexandria Engineering Journal 86 (243-257). DOI: 10.1016/j.aej.2023.11.047. Online publication date: Jan-2024
  • (2024) Pipeline Management Technologies for Sustainable Water Supply in a Smart City Environment. Encyclopedia of Sustainable Technologies (671-679). DOI: 10.1016/B978-0-323-90386-8.00055-3. Online publication date: 2024
  • (2023) Toward Sustainable Water Infrastructure: The State‐Of‐The‐Art for Modeling the Failure Probability of Water Pipes. Water Resources Research 59:4. DOI: 10.1029/2022WR033256. Online publication date: 30-Mar-2023
  • (2023) Smart Water Management. Springer Handbook of Internet of Things (805-824). DOI: 10.1007/978-3-031-39650-2_33. Online publication date: 12-Dec-2023
  • (2023) Machine Learning and Computer Vision Applications in Civil Infrastructure Inspection and Monitoring. Infrastructure Robotics (59-80). DOI: 10.1002/9781394162871.ch4. Online publication date: 15-Dec-2023
  • (2022) A Critical Review of the Sustainability of Multi-Utility Tunnels for Colocation of Subsurface Infrastructure. Frontiers in Sustainable Cities 4. DOI: 10.3389/frsc.2022.847819. Online publication date: 17-Feb-2022
  • (2022) Algorithmic Rural Road Planning in India: Constrained Capacities and Choices in Public Sector. Proceedings of the 2nd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (1-11). DOI: 10.1145/3551624.3555299. Online publication date: 6-Oct-2022
  • (2022) Urban Anomaly Analytics: Description, Detection, and Prediction. IEEE Transactions on Big Data 8:3 (809-826). DOI: 10.1109/TBDATA.2020.2991008. Online publication date: 1-Jun-2022
  • (2022) Predicting the risk of pipe failure using gradient boosted decision trees and weighted risk analysis. npj Clean Water 5:1. DOI: 10.1038/s41545-022-00165-2. Online publication date: 17-Jun-2022
  • (2021) Graph Convolutional Networks: Application to Database Completion of Wastewater Networks. Water 13:12 (1681). DOI: 10.3390/w13121681. Online publication date: 17-Jun-2021
