DOI: 10.1145/3269206.3271701
Research article (Public Access)

KAME: Knowledge-based Attention Model for Diagnosis Prediction in Healthcare

Published: 17 October 2018

ABSTRACT

The goal of the diagnosis prediction task is to predict patients' future health information from their historical Electronic Healthcare Records (EHR). The most important and challenging problem in diagnosis prediction is to design an accurate, robust and interpretable predictive model. Existing work addresses this problem by employing recurrent neural networks (RNNs) with attention mechanisms, but these approaches suffer from the data sufficiency problem. To obtain good performance with insufficient data, graph-based attention models have been proposed. However, when the training data are sufficient, they do not offer any improvement in performance over ordinary attention-based models. To address these issues, we propose KAME, an end-to-end, accurate and robust model for predicting patients' future health information. KAME not only learns reasonable embeddings for nodes in the knowledge graph, but also exploits general knowledge to improve prediction accuracy with the proposed knowledge attention mechanism. With the learned attention weights, KAME allows us to interpret the importance of each piece of knowledge in the graph. Experimental results on three real-world datasets show that the proposed KAME significantly improves prediction performance compared with state-of-the-art approaches, guarantees robustness with both sufficient and insufficient data, and learns interpretable disease representations.
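The abstract describes the model only at a high level. As an illustration of the general idea, the sketch below shows one plausible way to combine knowledge-graph node embeddings, an attention mechanism over each medical code's ancestors, and an RNN over visits for sequential diagnosis prediction. This is a minimal sketch under assumed interfaces: the class name, tensor layouts, the convention that leaf codes occupy the first node ids, and the concatenation-based attention scoring are illustrative choices made here, not the authors' implementation.

# Illustrative sketch only: a knowledge-attention diagnosis predictor in PyTorch.
# Names, dimensions, and the attention formulation are assumptions for this
# example; they are not taken from the KAME paper or its code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class KnowledgeAttentionPredictor(nn.Module):
    def __init__(self, num_codes, num_nodes, ancestors, dim=128, hidden=256):
        """
        num_codes : number of leaf medical codes (assumed to be node ids 0..num_codes-1)
        num_nodes : total number of nodes in the medical knowledge graph (leaves + ancestors)
        ancestors : LongTensor [num_codes, max_depth] with each code's ancestor node ids,
                    padded with the code's own id when its path is shorter
        """
        super().__init__()
        self.node_emb = nn.Embedding(num_nodes, dim)       # embeddings for all graph nodes
        self.register_buffer("ancestors", ancestors)
        self.attn = nn.Linear(2 * dim, 1)                  # scores each (code, ancestor) pair
        self.rnn = nn.GRU(dim, hidden, batch_first=True)   # sequence model over visits
        self.out = nn.Linear(hidden, num_codes)            # next-visit code logits

    def code_representations(self):
        # For each leaf code, attend over its ancestors in the knowledge graph and
        # use the attention-weighted sum of ancestor embeddings as the code embedding.
        leaf = self.node_emb.weight[: self.ancestors.size(0)]       # [C, d]
        anc = self.node_emb(self.ancestors)                         # [C, A, d]
        pairs = torch.cat([leaf.unsqueeze(1).expand_as(anc), anc], dim=-1)
        weights = F.softmax(self.attn(pairs), dim=1)                # attention over ancestors
        return (weights * anc).sum(dim=1)                           # [C, d]

    def forward(self, visits):
        # visits: multi-hot tensor [batch, num_visits, num_codes]
        codes = self.code_representations()                         # [C, d]
        visit_repr = torch.tanh(visits @ codes)                     # [B, T, d]
        h, _ = self.rnn(visit_repr)                                 # [B, T, hidden]
        return torch.sigmoid(self.out(h))                           # per-visit code probabilities

For multi-label next-visit prediction, a model of this shape would typically be trained with a binary cross-entropy loss over the output probabilities; the actual KAME objective, attention design, and use of the knowledge graph may differ from this sketch.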


Published in

CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
October 2018, 2362 pages
ISBN: 9781450360142
DOI: 10.1145/3269206

Copyright © 2018 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 17 October 2018


Qualifiers

Research article

        Acceptance Rates

CIKM '18 paper acceptance rate: 147 of 826 submissions, 18%. Overall acceptance rate: 1,861 of 8,427 submissions, 22%.
