Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks

Authors:
Fenglong Ma, SUNY Buffalo & Xerox, Buffalo, NY, USA
Radha Chitta, Conduent Labs US, Rochester, NY, USA
Jing Zhou, Conduent Labs US, Rochester, NY, USA
Quanzeng You, University of Rochester, Rochester, NY, USA
Tong Sun, United Technologies Research Center & Xerox, East Hartford, CT, USA
Jing Gao, SUNY Buffalo, Buffalo, NY, USA

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 2017, Pages 1903–1911
https://doi.org/10.1145/3097983.3098088

Published: 13 August 2017

ABSTRACT

Predicting the future health information of patients from their historical Electronic Health Records (EHR) is a core research task in the development of personalized healthcare. Patient EHR data consist of sequences of visits over time, where each visit contains multiple medical codes, including diagnosis, medication, and procedure codes. The most important challenges for this task are to model the temporality and high dimensionality of sequential EHR data and to interpret the prediction results. Existing work addresses this problem by employing recurrent neural networks (RNNs) to model EHR data and a simple attention mechanism to interpret the results. However, the performance of RNNs drops when sequences are long, and current RNN-based approaches ignore the relationships between subsequent visits. To address these issues, we propose Dipole, an end-to-end, simple, and robust model for predicting patients' future health information. Dipole employs bidirectional recurrent neural networks to remember the information of both past and future visits, and it introduces three attention mechanisms to measure the relationships between different visits for the prediction. With the attention mechanisms, Dipole can interpret the prediction results effectively. Dipole also allows us to interpret the learned medical code representations, which were positively confirmed by medical experts. Experimental results on two real-world EHR datasets show that Dipole significantly improves prediction accuracy compared with state-of-the-art diagnosis prediction approaches and provides clinically meaningful interpretation.
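
The idea of reading a visit sequence in both directions and then weighting visits by attention can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes multi-hot visit vectors, a plain tanh recurrent cell swapped in for whatever recurrent unit the paper uses, random untrained weights, and a single attention score computed from each visit's own hidden state. All names (`rnn_pass`, `bidirectional_attention`, `make_params`, `w_att`) are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rnn_pass(visits, W_in, W_rec):
    """Run a plain tanh RNN over the visits; return one hidden state per visit."""
    h = [0.0] * len(W_rec)
    states = []
    for x in visits:
        a = matvec(W_in, x)
        b = matvec(W_rec, h)
        h = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
        states.append(h)
    return states


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bidirectional_attention(visits, params):
    """Return (attention weights, context vector) for a visit sequence."""
    # Forward pass reads visits oldest to newest.
    fwd = rnn_pass(visits, params["W_in_f"], params["W_rec_f"])
    # Backward pass reads newest to oldest; re-reverse to align with visits.
    bwd = rnn_pass(visits[::-1], params["W_in_b"], params["W_rec_b"])[::-1]
    # Each visit is represented by its forward and backward states concatenated.
    states = [f + b for f, b in zip(fwd, bwd)]
    # Assumed scoring rule: one learned vector dotted with each visit's state.
    scores = [sum(w * s for w, s in zip(params["w_att"], h)) for h in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [sum(a * h[d] for a, h in zip(weights, states)) for d in range(dim)]
    return weights, context


def make_params(n_codes, hidden, seed=0):
    """Create small random (untrained) weights, for illustration only."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_in_f": mat(hidden, n_codes),
        "W_rec_f": mat(hidden, hidden),
        "W_in_b": mat(hidden, n_codes),
        "W_rec_b": mat(hidden, hidden),
        "w_att": [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)],
    }


# Three visits over a toy vocabulary of four medical codes (multi-hot rows).
visits = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
params = make_params(n_codes=4, hidden=3)
weights, context = bidirectional_attention(visits, params)
print("attention weights per visit:", weights)
```

The attention weights sum to one, so each weight can be read as the relative contribution of a visit to the context vector. That is the kind of per-visit interpretation the abstract describes, although with untrained weights the values here carry no clinical meaning.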

Index Terms

1. Applied computing
  1. Life and medical sciences
    1. Health informatics
2. Information systems
  1. Information systems applications
    1. Data mining

Published in
KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2017
2240 pages
ISBN: 9781450348874
DOI: 10.1145/3097983
General Chairs: Stan Matwin (Dalhousie University), Shipeng Yu (LinkedIn), Faisal Farooq (IBM)
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
attention mechanism
bidirectional recurrent neural networks
healthcare informatics
Qualifiers
- research-article
Conference

Acceptance Rates
KDD '17 Paper Acceptance Rate: 64 of 748 submissions, 9%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%