DOI: 10.1145/3097983.3097998 (KDD conference proceedings)
research-article
Open Access

Pharmacovigilance via Baseline Regularization with Large-Scale Longitudinal Observational Data

Published: 13 August 2017

ABSTRACT

Several prominent public health incidents caused by adverse drug events (ADEs) at the beginning of this century have raised awareness among governments and industries worldwide of pharmacovigilance (PhV), the science and activities of monitoring and preventing adverse events caused by pharmaceutical products after they are introduced to the market. A major data source for PhV is large-scale longitudinal observational databases (LODs), such as electronic health records (EHRs) and medical insurance claim databases. Inspired by the Multiple Self-Controlled Case Series (MSCCS) model, arguably the leading method for ADE discovery from LODs, we propose baseline regularization, a regularized generalized linear model that leverages the diverse health profiles available in LODs across different individuals at different times. We apply the proposed method, as well as MSCCS, to the Marshfield Clinic EHR. Experimental results suggest that incorporating the heterogeneity among different patients and different times helps to improve performance in identifying benchmark ADEs from the Observational Medical Outcomes Partnership ground truth.

Supplemental Material

kuang_baseline_regularization.mp4 (MP4, 415.8 MB)
